Preface



The 2nd International Conference on Information Security and Cryptology 
(ICISC) was sponsored by the Korea Institute of Information Security and Cryp- 
tology (KIISC). It took place at Korea University, Seoul, Korea, December 9-10, 
1999. Jong In Lee of Korea University was responsible for the organization. 

The call for papers brought 61 papers from 10 countries on four continents. As
in the previous year, the review process was fully blind: information about
authors and their affiliations was not given to the Technical Program Committee
(TPC) members. Each TPC member was random-coded and did not even know who
was reviewing which paper. The 23 TPC members finally selected 20 top-quality
papers for presentation at ICISC 1999, together with one invited talk. Serge
Vaudenay gave an invited talk on “Provable Security for Conventional
Cryptography”.

Many people contributed to ICISC’99. First of all I would like to thank all the 
authors who submitted papers. I am grateful to the TPC members for their hard 
work reviewing the papers and the Organization Committee members for all the 
supporting activities which made ICISC’99 a success. I would like to thank the 
Ministry of Information and Communication of Korea (MIC) which financially 
sponsored ICISC’99. Special thanks go to Pil Joong Lee and Heung Youl Youm 
who helped me during the whole process of preparation for the conference. Last, 
but not least, I thank my students, KyuMan Ko, Sungkyu Chie, and Chan Yoon 
Jung. 
On Provable Security for Conventional Cryptography



Serge Vaudenay 

Swiss Federal Institute of Technologies (EPFL) 
Serge.Vaudenay@epfl.ch



Abstract. Many results on the provable security of conventional
cryptography have been published so far. We provide here handy
tools based on Decorrelation Theory for dealing with them and we show
how to make their proofs easier. As an illustration we survey a few of
these results and we (im)prove some by our technique.

This paper covers results on pseudorandomness of some block cipher 
constructions and on message authentication code constructions. 



Decorrelation theory was introduced in [18]-[25]. Its first aim was to address 
provable security in the area of block ciphers in order to prove their security 
against differential and linear cryptanalysis. As a matter of fact, these techniques 
can also be used for other areas of conventional cryptography as shown in this 
paper. 

In [25] it was noticed that the decorrelation distance of some integral order $d$
is linked to the advantage of the best attack limited to $d$ samples in several
classes of attacks. Namely, non-adaptive attacks are characterized by the
decorrelation distance with the $|||.|||_\infty$ norm, chosen input attacks are
characterized by the decorrelation distance with the $\|.\|_a$ norm, and chosen
input and output attacks are characterized by the decorrelation distance with
the $\|.\|_s$ norm. This can be used to address provable security of, say, MAC
construction schemes. Thanks to the nice properties of decorrelation distances,
some previous results become simpler and more systematic.

A similar systematic approach was recently addressed by Maurer [11] and it 
would be interesting to compare both approaches. 

1 Definitions and Properties 

This section recalls basic facts in decorrelation theory. 

1.1 Definitions and Notations 

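First of all, to any random function $F$ from a set $\mathcal{M}_1$ to a set
$\mathcal{M}_2$ and any integer $d$ we associate the “$d$-wise distribution
matrix”, denoted $[F]^d$, which is defined in the matrix set
$\mathbf{R}^{\mathcal{M}_1^d \times \mathcal{M}_2^d}$ by

$$[F]^d_{(x_1,\ldots,x_d),(y_1,\ldots,y_d)} = \Pr[F(x_1) = y_1, \ldots, F(x_d) = y_d].$$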

JooSeok Song (Ed.): ICISC’99, LNCS 1787, pp. 1-16, 2000. 
Given a metric structure $D$ in $\mathbf{R}^{\mathcal{M}_1^d \times \mathcal{M}_2^d}$
we can define the distance between the matrices associated to two random
functions $F$ and $G$. This is the “$d$-wise decorrelation distance”. If $G$ is a
random function uniformly distributed in the set of all functions from
$\mathcal{M}_1$ to $\mathcal{M}_2$ (we let $F^*$ denote such a function), this
distance is called the “$d$-wise decorrelation bias of function $F$” and denoted
$\mathrm{DecF}^d_D(F)$. When $F$ is a permutation (which will usually be denoted
$C$ as for “Cipher”) and $G$ is a uniformly distributed permutation (denoted
$C^*$) it is called the “$d$-wise decorrelation bias of permutation $F$” and
denoted $\mathrm{DecP}^d_D(F)$. In previous results we used the metric structures
defined by the norms denoted $\|.\|_2$ (see [20]), $|||.|||_\infty$, $\|.\|_a$,
$\|.\|_s$ (see [25]). These four norms are matrix norms, which means that they
are norms on $\mathbf{R}^{\mathcal{M}^d \times \mathcal{M}^d}$ with the property that

$$\|A \times B\| \le \|A\| . \|B\|.$$

This property leads to non-trivial inequalities which can shorten many
treatments on the security of conventional cryptography.
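The definitions above can be checked by brute force on a toy example. The following is a minimal sketch, not taken from the paper: it assumes the $|||.|||_\infty$ norm is the largest absolute row sum of the matrix, and it uses a hypothetical affine family $F(x) = ax + b \bmod 3$ with uniform $a, b$ as the random function. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def distribution_matrix(family, domain, codomain, d):
    """d-wise distribution matrix [F]^d as {(x, y): Pr[F(x_i) = y_i for all i]}.

    `family` describes the random function F as (probability, function) pairs.
    """
    matrix = {}
    for x in product(domain, repeat=d):
        for y in product(codomain, repeat=d):
            matrix[(x, y)] = sum(
                (p for p, f in family if all(f(a) == b for a, b in zip(x, y))),
                Fraction(0),
            )
    return matrix

def dec_f_inf(family, domain, codomain, d):
    """Decorrelation bias DecF^d of `family` for the largest-row-sum norm."""
    # F*: every function table from domain to codomain, with equal probability.
    tables = [dict(zip(domain, t)) for t in product(codomain, repeat=len(domain))]
    uniform = [(Fraction(1, len(tables)), (lambda v, t=t: t[v])) for t in tables]
    m_f = distribution_matrix(family, domain, codomain, d)
    m_u = distribution_matrix(uniform, domain, codomain, d)
    return max(
        sum(abs(m_f[(x, y)] - m_u[(x, y)]) for y in product(codomain, repeat=d))
        for x in product(domain, repeat=d)
    )

p = 3
Z = list(range(p))
# Hypothetical example family: F(x) = a*x + b mod p with (a, b) uniform.
affine = [(Fraction(1, p * p), (lambda v, a=a, b=b: (a * v + b) % p))
          for a in Z for b in Z]

dec2 = dec_f_inf(affine, Z, Z, 2)
dec3 = dec_f_inf(affine, Z, Z, 3)
print("DecF^2 =", dec2, " DecF^3 =", dec3)
```

The affine family is pairwise independent, so its bias of order 2 vanishes, while three queries already reveal the structure. Exact rational arithmetic is used so that a zero bias is reported as exactly zero.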

Given two random functions $F$ and $G$ from $\mathcal{M}_1$ to $\mathcal{M}_2$ we
call “distinguisher between $F$ and $G$” any oracle Turing machine
$\mathcal{A}^O$ which can send $\mathcal{M}_1$-element queries to the oracle $O$
and receive $\mathcal{M}_2$-element responses, and which finally outputs 0 or 1.
In particular the Turing machine can be probabilistic. In the following, the
number of queries to the oracle will be limited to $d$. The distributions on $F$
and $G$ induce distributions on $\mathcal{A}^F$ and $\mathcal{A}^G$, thus we can
compute the probability that these probabilistic Turing machines output 1. The
advantage for distinguishing $F$ from $G$ is

$$\mathrm{Adv}_{\mathcal{A}}(F, G) = \Pr[\mathcal{A}^F = 1] - \Pr[\mathcal{A}^G = 1].$$

We consider the class $\mathrm{Cl}^d_{na}$ of non-adaptive distinguishers limited
to $d$ queries, which are distinguishers that must commit to $d$ queries before
receiving the responses. Similarly, we consider its extension $\mathrm{Cl}^d_a$
of all distinguishers limited to $d$ queries. For instance, these distinguishers
can choose the second query with the hint of the first response. Finally, when
$F$ and $G$ are permutations, we also consider the extension $\mathrm{Cl}^d_s$ of
distinguishers limited to $d$ queries but which can query either the function
$F/G$ or its inverse $F^{-1}/G^{-1}$. For any class of distinguishers
$\mathrm{Cl}$ we will denote

$$\mathrm{BestAdv}_{\mathrm{Cl}}(F, G) = \max_{\mathcal{A} \in \mathrm{Cl}} \mathrm{Adv}_{\mathcal{A}}(F, G).$$

We notice that if $\mathcal{A}$ is a distinguisher, we can always define a
complementary distinguisher $\bar{\mathcal{A}} = 1 - \mathcal{A}$ which gives the
opposite output. There is no need for investigating the minimum advantage when
the class is closed under the complement (which is the case of the above
classes) since

$$\mathrm{Adv}_{\bar{\mathcal{A}}}(F, G) = -\mathrm{Adv}_{\mathcal{A}}(F, G).$$
1.2 Properties

The $d$-wise distribution matrices have the property that if $F$ and $G$ are
independent random functions, $F$ from $\mathcal{M}_2$ to $\mathcal{M}_3$ and $G$
from $\mathcal{M}_1$ to $\mathcal{M}_2$, then

$$[F \circ G]^d = [G]^d \times [F]^d.$$

As an illustrative consequence, if $F^*$ is a uniformly distributed random
function, we obtain that

$$[F]^d \times [F^*]^d = [F^*]^d \times [F]^d = [F^*]^d.$$

Thus, if we are using a matrix norm $\|.\|$, we obtain

$$\mathrm{DecF}^d_{\|.\|}(F \circ G) \le \mathrm{DecF}^d_{\|.\|}(F) . \mathrm{DecF}^d_{\|.\|}(G)$$

and the same for permutations. In the sequel we will refer to this as the
multiplicative property of decorrelation biases.

The $|||.|||_\infty$, $\|.\|_a$ and $\|.\|_s$ norms have the quite interesting
property that they characterize the best advantage of a distinguisher in
$\mathrm{Cl}^d_{na}$, $\mathrm{Cl}^d_a$ or $\mathrm{Cl}^d_s$.

Lemma 1 ([18,25]). For any random functions $F$ and $G$ we have

$$|||[F]^d - [G]^d|||_\infty = 2 . \mathrm{BestAdv}_{\mathrm{Cl}^d_{na}}(F, G)$$

$$\|[F]^d - [G]^d\|_a = 2 . \mathrm{BestAdv}_{\mathrm{Cl}^d_a}(F, G)$$

and when $F$ and $G$ are permutations we also have

$$\|[F]^d - [G]^d\|_s = 2 . \mathrm{BestAdv}_{\mathrm{Cl}^d_s}(F, G).$$

This is quite a useful property which may lead to some non-trivial inequalities.
For example, if $F_1, \ldots, F_r$ are $r$ independent identically distributed
random functions from $\mathcal{M}$ to itself, and if $F^*$ denotes a uniformly
distributed random function, we have

$$\mathrm{BestAdv}_{\mathrm{Cl}^d_a}(F_1 \circ \ldots \circ F_r, F^*) \le 2^{r-1} \left(\mathrm{BestAdv}_{\mathrm{Cl}^d_a}(F_1, F^*)\right)^r.$$
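The composition bound follows in two steps from Lemma 1 and the multiplicative property of decorrelation biases. A short derivation sketch, where $\mathrm{Cl}^d_a$ denotes the class of adaptive distinguishers limited to $d$ queries and the $F_i$ are identically distributed:

```latex
\begin{align*}
2\,\mathrm{BestAdv}_{\mathrm{Cl}^d_a}(F_1 \circ \cdots \circ F_r, F^*)
  &= \mathrm{DecF}^d_{\|.\|_a}(F_1 \circ \cdots \circ F_r)
     && \text{(Lemma 1)} \\
  &\le \prod_{i=1}^{r} \mathrm{DecF}^d_{\|.\|_a}(F_i)
     && \text{(multiplicative property)} \\
  &= \left(2\,\mathrm{BestAdv}_{\mathrm{Cl}^d_a}(F_1, F^*)\right)^r
     && \text{(Lemma 1, identical distributions)}
\end{align*}
```

Dividing both sides by 2 gives the factor $2^{r-1}$.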

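There is a simple link between $\|.\|_s$ and $\|.\|_a$. Actually, if for any
random permutation $C$ on $\mathcal{M}$ we let $\bar{C}$ be defined from
$\{0,1\} \times \mathcal{M}$ to $\mathcal{M}$ by

$$\bar{C}(0, x) = C(x) \quad \text{and} \quad \bar{C}(1, x) = C^{-1}(x),$$

then obviously we have

$$\|[C_1]^d - [C_2]^d\|_s = \|[\bar{C}_1]^d - [\bar{C}_2]^d\|_a. \qquad (1)$$

Finally we recall the following lemma.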
Lemma 2 ([25]). Let $d$ be an integer, $F_1, \ldots, F_r$ be $r$ independent
random function oracles, and $C_1, \ldots, C_s, D_1, \ldots, D_t$ be $s+t$
independent random permutation oracles. We let $\Omega$ be an oracle which can
access the previous oracles and which from each query $x$ defines an output
$G(x)$. We assume that $\Omega$ is such that the number of queries to $F_i$ and
$C_j$ is limited to some integers $a_i$ and $b_j$ respectively, and the number of
queries to $D_k$ or $D_k^{-1}$ is limited to $c_k$ in total, for any
$i = 1, \ldots, r$, $j = 1, \ldots, s$ and $k = 1, \ldots, t$. We let the $F_i^*$
(resp. $C_j^*$, $D_k^*$) be independent uniformly distributed random functions
(resp. permutations) on the same range as $F_i$ (resp. $C_j$, $D_k$) and we let
$G^*$ be the function defined by $\Omega$ when the $F_i$, $C_j$ and $D_k$ are
replaced by $F_i^*$, $C_j^*$ and $D_k^*$. We have

$$\mathrm{DecF}^d_{\|.\|_a}(G) \le \sum_{i=1}^{r} \mathrm{DecF}^{a_i d}_{\|.\|_a}(F_i) + \sum_{j=1}^{s} \mathrm{DecP}^{b_j d}_{\|.\|_a}(C_j) + \sum_{k=1}^{t} \mathrm{DecP}^{c_k d}_{\|.\|_s}(D_k) + \mathrm{DecF}^d_{\|.\|_a}(G^*).$$

In addition, if the $\Omega$ construction defines a permutation $G$, assuming that
computing $G^{-1}$ leads to the same $a_i$, $b_j$ and $c_k$ limits, we have

$$\mathrm{DecP}^d_{\|.\|_s}(G) \le \sum_{i=1}^{r} \mathrm{DecF}^{a_i d}_{\|.\|_a}(F_i) + \sum_{j=1}^{s} \mathrm{DecP}^{b_j d}_{\|.\|_a}(C_j) + \sum_{k=1}^{t} \mathrm{DecP}^{c_k d}_{\|.\|_s}(D_k) + \mathrm{DecP}^d_{\|.\|_s}(G^*).$$

This lemma actually separates the problem of studying the decorrelation bias of
a construction scheme into the problem of studying the decorrelation biases of
its internal primitives $F_i$, $C_j$ and $D_k$ and studying the decorrelation
bias of an ideal version $G^*$ with truly random functions inside.

2 Randomness of Cryptographic Primitives Designs 

2.1 The Luby Rackoff Result 

Many previous papers on cryptography investigated how much randomness a given
design scheme provides. The Luby-Rackoff result [8] addressed the Feistel
construction [5] with truly random rounds. Translated into our formalism, it
becomes the following lemma.

Lemma 3 (Luby-Rackoff 1986 [8]). Let $F_1^*, F_2^*, F_3^*, F_4^*$ be four
independent random functions on $\{0,1\}^{m/2}$ with uniform distribution. We have

$$\mathrm{DecF}^d_{\|.\|_a}(\Psi(F_1^*, F_2^*, F_3^*)) \le 2d^2 . 2^{-m/2}$$
$$\mathrm{DecP}^d_{\|.\|_a}(\Psi(F_1^*, F_2^*, F_3^*)) \le 2d^2 . 2^{-m/2}$$
$$\mathrm{DecP}^d_{\|.\|_s}(\Psi(F_1^*, F_2^*, F_3^*, F_4^*)) \le 2d^2 . 2^{-m/2}.$$

The results hold for Feistel schemes defined from any (quasi)group operation.
(Here $\Psi(F_1, \ldots, F_r)$ is the standard notation introduced by Luby and
Rackoff in order to denote a Feistel scheme in which the $i$th round function is
$F_i$.) This result can be used together with Lemma 2 in order to study the
decorrelation bias of a Feistel scheme with independent rounds. This is the
basis of the Peanut construction on which the AES candidate DFC [2,6,7] is based.
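To make the object of Lemma 3 concrete, here is a minimal sketch of a Feistel scheme with independent, uniformly sampled round functions. It is illustrative only: it assumes XOR as the group operation, round functions stored as lookup tables, and the convention that the final swap of halves is omitted. The names `random_round_function`, `feistel` and `feistel_inverse` are not from the paper.

```python
import random

def random_round_function(half_bits, rng):
    """Sample a uniformly distributed function on half_bits-bit strings as a table."""
    return [rng.randrange(1 << half_bits) for _ in range(1 << half_bits)]

def feistel(round_tables, x):
    """Apply one round (L, R) -> (R, L xor F_i(R)) per table, omitting the last swap."""
    left, right = x
    for table in round_tables:
        left, right = right, left ^ table[right]
    return (right, left)  # undo the swap performed by the last round

def feistel_inverse(round_tables, y):
    """Invert `feistel` by undoing the rounds in reverse order."""
    right, left = y
    for table in reversed(round_tables):
        left, right = right ^ table[left], left
    return (left, right)

HALF_BITS = 4                      # toy block size m = 8 bits
rng = random.Random(1)             # fixed seed, for reproducibility only
tables = [random_round_function(HALF_BITS, rng) for _ in range(3)]

inputs = [(l, r) for l in range(1 << HALF_BITS) for r in range(1 << HALF_BITS)]
outputs = [feistel(tables, x) for x in inputs]
print("distinct outputs:", len(set(outputs)), "of", len(inputs))
```

Whatever the round functions are, the scheme is a permutation, which is why the lemma can speak of DecP as well as DecF for it.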

2.2 General Approach 

Many other similar results (and extensions) followed the Luby-Rackoff paper,
for instance Zheng-Matsumoto-Imai [27] and Patarin’s Thesis [13]. We show
here how we can make this kind of result easier by using a technique inspired by
both Patarin’s “H coefficient method” [13,14] and Maurer [10].

We aim to upper bound the decorrelation bias of a given random function.
Our paradigm consists in first proving a combinatorial lemma by using the
structure of the random function, then using a standard proof given by the
following lemma.

Lemma 4. Let $d$ be an integer. Let $F$ be a random function from a set
$\mathcal{M}_1$ to a set $\mathcal{M}_2$. We let $\mathcal{X}$ be the subset of
$\mathcal{M}_1^d$ of all $(x_1, \ldots, x_d)$ with pairwise different entries.
We let $F^*$ be a uniformly distributed random function from $\mathcal{M}_1$ to
$\mathcal{M}_2$. We know that for all $x \in \mathcal{X}$ and
$y \in \mathcal{M}_2^d$ the value $[F^*]^d_{x,y}$ is a constant
$p_0 = (\#\mathcal{M}_2)^{-d}$. We assume there exists a subset
$\mathcal{Y} \subseteq \mathcal{M}_2^d$ and two positive real values
$\epsilon_1$ and $\epsilon_2$ such that

- $|\mathcal{Y}| p_0 \ge 1 - \epsilon_1$
- $\forall x \in \mathcal{X} \; \forall y \in \mathcal{Y} \quad [F]^d_{x,y} \ge p_0 (1 - \epsilon_2)$.

Then we have $\mathrm{DecF}^d_{\|.\|_a}(F) \le 2\epsilon_1 + 2\epsilon_2$.

This lemma intuitively means that if $[F]^d_{x,y}$ is close to $[F^*]^d_{x,y}$
for all $x$ and almost all $y$, then the decorrelation bias of $F$ is small.

We have a twin lemma for the $\|.\|_s$ norm. Here, since we can query $y$ as well,
the approximation must hold for all $x$ and $y$.

Lemma 5. Let $d$ be an integer. Let $C$ be a random permutation on a set
$\mathcal{M}$. We let $\mathcal{X}$ be the subset of $\mathcal{M}^d$ of all
$(x_1, \ldots, x_d)$ with pairwise different entries. We let $F^*$ be a uniformly
distributed random function on $\mathcal{M}$. We let $C^*$ be a uniformly
distributed random permutation on $\mathcal{M}$. We have

- if $[C]^d_{x,y} \ge [C^*]^d_{x,y}(1 - \epsilon)$ for all $x$ and $y$ in
  $\mathcal{X}$ then $\mathrm{DecP}^d_{\|.\|_s}(C) \le 2\epsilon$
- if $[C]^d_{x,y} \ge [F^*]^d_{x,y}(1 - \epsilon)$ for all $x$ and $y$ in
  $\mathcal{X}$ then $\mathrm{DecP}^d_{\|.\|_s}(C) \le 2\epsilon + 2d^2(\#\mathcal{M})^{-1}$.

Proof. We use the characterization of $\mathrm{DecF}^d_{\|.\|_a}$ in terms of the
best adaptive distinguisher. We let $\mathcal{A}$ be one $d$-limited
distinguisher between $F$ and $F^*$ with maximum advantage. We can assume w.l.o.g.
that all queries to the oracle are pairwise different (we can simulate the
distinguisher by replacing repeated queries by dummy queries). The behavior of
$\mathcal{A}$ is deterministically defined by the initial random tape $\omega$
and the oracle responses $y = (y_1, \ldots, y_d)$. We let $x_i$ denote the $i$th
query defined by $(\omega, y)$. It actually depends on $\omega$ and
$y_1, \ldots, y_{i-1}$ only. We let $x = (x_1, \ldots, x_d)$, which is assumed to
be in $\mathcal{X}$. We let $A$ be the set of all $(\omega, y)$ for which
$\mathcal{A}$ outputs 0. It is straightforward that

$$\mathrm{Adv}_{\mathcal{A}}(F, F^*) = - \sum_{(\omega,y) \in A} \Pr[\omega] \left( [F]^d_{x,y} - [F^*]^d_{x,y} \right).$$

Next we have

$$\mathrm{Adv}_{\mathcal{A}}(F, F^*) \le \sum_{\substack{(\omega,y) \in A \\ y \in \mathcal{Y}}} \Pr[\omega] \, \epsilon_2 \, [F^*]^d_{x,y} + \sum_{\substack{(\omega,y) \in A \\ y \notin \mathcal{Y}}} \Pr[\omega] \, [F^*]^d_{x,y}.$$

By relaxing the $(\omega, y) \in A$ condition in the first sum, we observe that it
is upper bounded by $\epsilon_2$. (We just have to sum over the $y_j$s backward,
starting by summing over all $y_d$s, then $y_{d-1}$, ...) For the second sum, we
recall that all $x_i$s are pairwise different, so $[F^*]^d_{x,y}$ is always equal
to $p_0$. This sum is thus less than $\epsilon_1$.

For the $\|.\|_s$ norm, we simply use the $\bar{C}$ function as in Equation (1).
In the case where we approximate $[C]^d_{x,y}$ by $[F^*]^d_{x,y}$, we just notice
that

$$[C^*]^d_{x,y} - [F^*]^d_{x,y} \le [F^*]^d_{x,y} \, d^2 (\#\mathcal{M})^{-1}.$$

□

As an example of application (which will be used later on) we prove the 
following lemma. 

Lemma 6. For a random uniformly distributed function $F^*$ and a random
uniformly distributed permutation $C^*$ defined over $\{0,1\}^m$, we have

$$\mathrm{DecP}^d(F^*) = \mathrm{DecF}^d(C^*) \le d(d-1) 2^{-m}.$$

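Proof. We use Lemma 4 for $\mathrm{DecF}^d(C^*)$. We let $\mathcal{Y}$ be equal to
the full set of outputs with pairwise different entries. For
$x \in \mathcal{X}$ and $y \in \mathcal{Y}$ we have

$$\frac{[C^*]^d_{x,y}}{p_0} = \frac{1}{(1 - 2^{-m}) \cdots (1 - (d-1).2^{-m})} \ge 1$$

so we can take $\epsilon_2 = 0$. Furthermore

$$1 - |\mathcal{Y}| p_0 = 1 - (1 - 2^{-m}) \cdots (1 - (d-1).2^{-m}) \le \frac{d(d-1)}{2} 2^{-m}$$

which gives $\epsilon_1$. □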
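The bound of Lemma 6 can be compared numerically with an exact value in the simplest setting. The sketch below is restricted to non-adaptive distinguishers, which is an assumption of this example rather than a statement of the lemma: for $d$ pairwise different queries, the distance between $[C^*]^d$ and $[F^*]^d$ is then the statistical distance between drawing $d$ outputs without and with replacement. The helper names are illustrative.

```python
from fractions import Fraction

def exact_nonadaptive_bias(m, d):
    """2 * (1 - prod_{i<d} (1 - i / 2^m)) for d pairwise different queries."""
    n = 2 ** m
    all_distinct = Fraction(1)
    for i in range(d):
        all_distinct *= Fraction(n - i, n)   # probability that d uniform draws are distinct
    return 2 * (1 - all_distinct)

def lemma6_bound(m, d):
    """The bound d(d-1) 2^-m stated in Lemma 6."""
    return Fraction(d * (d - 1), 2 ** m)

for m, d in [(8, 2), (8, 8), (8, 16), (16, 64)]:
    exact = exact_nonadaptive_bias(m, d)
    bound = lemma6_bound(m, d)
    print(f"m={m:2d} d={d:3d} exact={float(exact):.6f} bound={float(bound):.6f}")
```

The bound is tight for $d = 2$ and becomes loose only when $d$ approaches $2^{m/2}$, the usual birthday threshold.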

We can now prove Lemma 3 by using Lemmas 4, 5 and 6, with a quite compact proof.

Proof (Lemma 3). Following the Feistel scheme $F = \Psi(F_1^*, F_2^*, F_3^*)$, we let

$$x_i = (z_i^0, z_i^1)$$
$$z_i^2 = z_i^0 + F_1^*(z_i^1)$$
$$y_i = (z_i^4, z_i^3).$$

We let $E$ be the event $z_i^3 = z_i^1 + F_2^*(z_i^2)$ and
$z_i^4 = z_i^2 + F_3^*(z_i^3)$ for $i = 1, \ldots, d$. We thus have
$[F]^d_{x,y} = \Pr[E]$. We now define

$$\mathcal{Y} = \left\{ (y_1, \ldots, y_d) ; \; \forall i < j \quad z_i^3 \ne z_j^3 \right\}.$$

(This is a set of non-pathologic outputs when computing $[F]^d_{x,y}$.) We can
easily check that $\mathcal{Y}$ fulfills the requirements of Lemma 4. Firstly we
have

$$|\mathcal{Y}| \ge \left( 1 - \frac{d(d-1)}{2} 2^{-m/2} \right) 2^{md}$$

thus we let $\epsilon_1 = \frac{d(d-1)}{2} 2^{-m/2}$. Second, for
$y \in \mathcal{Y}$ and any $x$ (with pairwise different entries), we need to
consider $[F]^d_{x,y}$. Let $E^2$ be the event that all $z_i^2$s are pairwise
different over the distribution of $F_1^*$. We have

$$[F]^d_{x,y} \ge \Pr[E / E^2] \Pr[E^2].$$

For computing $\Pr[E / E^2]$ we know that the $z_i^2$s are pairwise different, as
are the $z_i^3$s. Hence $\Pr[E / E^2] = 2^{-md}$. It is then straightforward that
$\Pr[E^2] \ge 1 - \frac{d(d-1)}{2} 2^{-m/2}$, which is $1 - \epsilon_2$. We thus
obtain from Lemma 4 that $\mathrm{DecF}^d_{\|.\|_a}(F) \le 2d(d-1) 2^{-m/2}$.
From this and Lemma 6 we thus obtain
$\mathrm{DecP}^d_{\|.\|_a}(F) \le 2d^2 2^{-m/2}$ for $d \le 2^{1+m/2}$.
Since DecP is always less than 2, it also holds for larger $d$.

Thanks to Lemma 5, the $\|.\|_s$-norm case with
$C = \Psi(F_1^*, F_2^*, F_3^*, F_4^*)$ is fairly similar. We let

$$x_i = (z_i^0, z_i^1)$$
$$z_i^2 = z_i^0 + F_1^*(z_i^1)$$
$$y_i = (z_i^5, z_i^4)$$
$$z_i^3 = z_i^5 - F_4^*(z_i^4)$$

for $i = 1, \ldots, d$. We have

$$[C]^d_{x,y} = \Pr\left[ z_i^3 = z_i^1 + F_2^*(z_i^2) \text{ and } z_i^4 = z_i^2 + F_3^*(z_i^3) ; \; i = 1, \ldots, d \right].$$

Let $E$ be this event. If we let $E^2$ (resp. $E^3$) be the event that all $z_i^2$
(resp. $z_i^3$) are pairwise different, then

$$[C]^d_{x,y} \ge \Pr[E / E^2, E^3] \Pr[E^2, E^3].$$

The $E^2$ and $E^3$ events are independent, and both hold with probability
greater than $1 - d(d-1) 2^{-m/2}$, which is $1 - \epsilon$. The probability of
$E$ when $E^2$ and $E^3$ hold is obviously $2^{-md}$. So,
$\mathrm{DecP}^d_{\|.\|_s}(C) \le 2d^2 . 2^{-m/2}$. □
2.3 Random Permutations Collection

We collect here a few results taken from the literature, which can be proven by
our techniques.

First of all, Zheng-Matsumoto-Imai investigated the pseudorandomness of
generalized Feistel transformations. Instead of the original two branches, they
consider $k$ branches. The first generalization (“type-1 transformation”)
$\Psi_1(f_1, \ldots, f_r)$ is defined by

$$\Psi_1(f_1, \ldots, f_r)(x_1, \ldots, x_k) = \Psi_1(f_2, \ldots, f_r)(f_1(x_1) + x_2, x_3, x_4, \ldots, x_k, x_1).$$

The second generalization (“type-2 transformation”) $\Psi_2(f_1, \ldots, f_r)$ is
similarly defined for $k$ even and $r$ multiple of $\frac{k}{2}$ by

$$\Psi_2(f_1, \ldots, f_r)(x_1, \ldots, x_k) = \Psi_2(f_{\frac{k}{2}+1}, \ldots, f_r)(f_1(x_1) + x_2, x_3, f_2(x_3) + x_4, x_5, \ldots, f_{\frac{k}{2}}(x_{k-1}) + x_k, x_1).$$

And finally the “type-3 transformation” $\Psi_3(f_1, \ldots, f_r)$ for $r$
multiple of $k-1$:

$$\Psi_3(f_1, \ldots, f_r)(x_1, \ldots, x_k) = \Psi_3(f_k, \ldots, f_r)(f_1(x_1) + x_2, f_2(x_2) + x_3, \ldots, f_{k-1}(x_{k-1}) + x_k, x_1).$$

(We also define $\Psi_\ell()(x) = x$ for $\ell = 1, 2, 3$.)

Lemma 7 (Zheng-Matsumoto-Imai 1989 [27]). We consider the previous
generalizations of Feistel schemes with $k$ branches on $\{0,1\}^{m/k}$. For
independent uniformly distributed random functions $F_i^*$ and any integer $d$,
we have

$$\mathrm{DecP}^d_{\|.\|_a}(\Psi_1(F_1^*, \ldots, F_{2k-1}^*)) \le 2(k-1) d^2 . 2^{-m/k}$$

$$\mathrm{DecP}^d_{\|.\|_a}(\Psi_2(F_1^*, \ldots, F_{k(k+1)/2}^*)) \le \frac{k^2}{2} d^2 . 2^{-m/k}$$

$$\mathrm{DecP}^d_{\|.\|_a}(\Psi_3(F_1^*, \ldots, F_{k^2-1}^*)) \le (k^2 - 2k + 2) d^2 . 2^{-m/k}.$$

The results hold for generalized Feistel schemes defined from any (quasi)group
operation.¹

(Note that they all generalize Lemma 3 for which $k = 2$.)

¹ Here we slightly improved the original result for $\Psi_3$ from [27] which is
$\mathrm{DecP}^d_{\|.\|_a}(\Psi_3(F_1^*, \ldots, F_{k^2-1}^*)) \le k(k-1) d^2 2^{-m/k}$.

Proof (Sketch). We use Lemma 4 for evaluating DecF.

For $\Psi_1$ we let $\mathcal{Y}$ be the set of all $y = (y_1, \ldots, y_d)$ where
$y_i = (y_i^1, \ldots, y_i^k)$ such that we have $y_i^j = y_{i'}^j$ for no $j > 1$
and $i < i'$. We get $\epsilon_1 = (k-1) \frac{d(d-1)}{2} 2^{-m/k}$. We then
consider the event in which the first entry after the $(k-1)$th round takes
pairwise different values for $x_1, \ldots, x_d$. Upper bounding the probability
that this event does not occur we get
$\epsilon_2 = (k-1) \frac{d(d-1)}{2} 2^{-m/k}$. Thus
$\mathrm{DecF}^d(F) \le 2(k-1) d(d-1) 2^{-m/k}$.

Similarly, for $\Psi_2$ we let $\mathcal{Y}$ be the set of all $y$ such that we
have $y_i^j = y_{i'}^j$ for no even $j$ and $i < i'$. We get
$\epsilon_1 = \frac{k}{2} \times \frac{d(d-1)}{2} 2^{-m/k}$. We consider the event
in which all odd entries after the $(k-1)$th round take pairwise different values
for $x_1, \ldots, x_d$. We get
$\epsilon_2 = \frac{k}{2}(k-1) \times \frac{d(d-1)}{2} 2^{-m/k}$. Thus
$\mathrm{DecF}^d(F) \le \frac{k^2}{2} d(d-1) 2^{-m/k}$.

For $\Psi_3$ we let $\mathcal{Y}$ be the set of all $y$ such that we have
$y_i^k = y_{i'}^k$ for no $i < i'$. We get
$\epsilon_1 = \frac{d(d-1)}{2} 2^{-m/k}$. We consider the event in which all
first $k-1$ entries after the $(k-1)$th round take pairwise different values for
$x_1, \ldots, x_d$. We get $\epsilon_2 = (k-1)^2 \frac{d(d-1)}{2} 2^{-m/k}$. Thus
$\mathrm{DecF}^d(F) \le (k^2 - 2k + 2) d(d-1) 2^{-m/k}$.

In the three cases, $\epsilon_2$ is evaluated as the number of unexpected
equalities between two outputs from a single circuit of depth $k-1$ with $k$
inputs and internal $F^*$ and additions, times the probability that one occurs,
which is at most the depth $k-1$ times $2^{-m/k}$.

Now to get DecP from DecF, from Lemma 6 and the triangular inequality we have

$$\mathrm{DecP}^d(F) \le \mathrm{DecF}^d(F) + \mathrm{DecP}^d(F^*) \le \mathrm{DecF}^d(F) + d^2 2^{-m}.$$

We then notice that the obtained upper bounds for DecF are always of the form
$\mathrm{DecF}^d(F) \le A d(d-1) 2^{-m/k}$ for some $A \ge 2$. For
$d \le A 2^{m - m/k}$ we thus obtain $\mathrm{DecP}^d(F) \le A d^2 2^{-m/k}$. For
larger $d$, this bound is greater than $A^3 2^{2m - 3m/k}$, which is greater than
8 since $m \ge k \ge 2$. Since $\mathrm{DecP}^d(F)$ is always less than 2, the
bound is thus still valid. □



One problem proposed by Schnorr in the late 80’s was to extend the Luby-Rackoff
result with a single random function. This was first solved by Pieprzyk, who
showed that $\Psi(F^*, F^*, F^*, F^* \circ F^*)$ is pseudorandom [16]. In [14],
Patarin improved this result by using fewer rounds, showing that
$\Psi(F^*, F^*, F^* \circ C \circ F^*)$ is pseudorandom when $C$ is a special
fixed function (like for instance the bitwise circular rotation).

Lucks [9] adopted a different approach for reducing the number of random
functions in Lemma 3: instead of making all functions depend on each other, he
tried to “derandomize” the first and last functions and to use unbalanced Feistel
schemes (with branches of different sizes). This work was followed by that of
Naor and Reingold [12], who proved the following result.

Lemma 8 (Naor-Reingold 1999 [12]). Let $C_1$ and $C_2$ be two random
permutations such that $\mathrm{DecP}^1(C_i) = 0$ and
$\mathrm{DecP}^2_{|||.|||_\infty}(C_i) \le \delta_i$ for $i = 1, 2$. Let $F^*$ be
a uniformly distributed random function. We assume that $C_1$, $C_2$ and $F^*$
are independent. For any integer $d$, we have

$$\mathrm{DecP}^d_{\|.\|_s}(C_2^{-1} \circ \Psi(F^*, F^*) \circ C_1) \le d(d-1) \frac{\delta_1 + \delta_2}{2} + \frac{4d^2}{2^{m/2}}.$$

The result holds for Feistel schemes defined from any (quasi)group operation.²

The $\mathrm{DecP}^1(C_i) = 0$ requirement is quite easy to achieve since for any
random $C_0$ we can construct $C_i(x) = C_0(x) + K$ with an independent uniformly
distributed $K$ and we get the requirement. (For any quasigroup addition.)

Proof (sketch). For $C_2^{-1} \circ \Psi(F^*, F^*) \circ C_1(x_i) = y_i$ we let
$u_i$ and $v_i$ denote the inputs of the two $F^*$ functions, which are the right
half of $C_1(x_i)$ and the left half of $C_2(y_i)$ respectively. Applying Lemma 5
we consider the event that all $u_i$ and $v_j$ are pairwise different. We get

$$\epsilon = \sum_{i<j} \Pr[u_i = u_j] + \sum_{i<j} \Pr[v_i = v_j] + \sum_{i,j} \Pr[u_i = v_j] + d^2 . 2^{-m}.$$

Let us first consider $\Pr[u_i = u_j]$. We have

$$\Pr[u_i = u_j] = \sum_{\xi_i \ne \xi_j} \Pr[x_i = \xi_i, x_j = \xi_j] \Pr[u(\xi_i) = u(\xi_j)]$$

where $u(\xi)$ denotes the right half of $C_1(\xi)$ and the probabilities
$\Pr[x_i = \xi_i, x_j = \xi_j]$ are taken over the distribution of the
distinguisher and the oracle. Hence we have

$$\Pr[u_i = u_j] \le \max_{\xi_i \ne \xi_j} \Pr[u(\xi_i) = u(\xi_j)].$$

Now we can see that $u(\xi_i) = u(\xi_j)$ defines a distinguisher for $C_1$ with
two non-adaptive deterministic queries $\xi_i$ and $\xi_j$. Hence from
$\mathrm{DecP}^2_{|||.|||_\infty}(C_1) \le \delta_1$ we can get
$\Pr[u_i = u_j] \le \frac{\delta_1}{2} + 2^{-m/2}$. The same holds for
$\Pr[v_i = v_j]$. Similarly, since $\mathrm{DecP}^1(C_i) = 0$ and $u_i$ and $v_j$
are independent we have $\Pr[u_i = v_j] = 2^{-m/2}$. Hence

$$\epsilon \le \frac{d(d-1)}{2} \times \frac{\delta_1 + \delta_2}{2} + d(2d-1) . 2^{-m/2} + d^2 . 2^{-m}.$$

Thus the result holds when $2d \le 2^{m/2}$. Since the bound is greater than
$2^{m/2}$ in the other case, it also holds for $m > 1$. The trivial $m = 1$ case
is solved by inspection. □

² The original result from [12] was not stated with the same hypothesis. The
first result corresponds to $\delta_1 = \delta_2 = 0$ and states

$$\mathrm{DecP}^d_{\|.\|_s}(C_2^{-1} \circ \Psi(F^*, F^*) \circ C_1) \le 2d^2 2^{-m/2} + 2d^2 2^{-m}.$$

The second result needs two independent random functions $F_1^*$ and $F_2^*$ and
the hypothesis that for any fixed $x \ne y$ the right (resp. left) halves of
$C_1(x)$ and $C_1(y)$ (resp. $C_2(x)$ and $C_2(y)$) are equal with probability
less than $\delta$. It states

$$\mathrm{DecP}^d_{\|.\|_s}(C_2^{-1} \circ \Psi(F_1^*, F_2^*) \circ C_1) \le 2d^2 \delta + 2d^2 2^{-m}.$$
In addition Naor and Reingold presented an alternate scheme with parallel
branches which achieves better decorrelation bias.

Other alternate permutation constructions were investigated, for instance by 
Sugita [17] based on MISTY-like permutations and in [24] based on IDEA-like 
permutations. 



3 Mode of Operation and Similar Constructions 



3.1 CBC-MAC 

Another family of results originated with Bellare-Kilian-Rogaway [4]. This
paper considers the regular CBC-MAC construction which transforms a block
cipher function $C$ into a message authentication code MAC by

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = C(C(\ldots C(m_1) + m_2 \ldots) + m_\ell).$$

We do not really need invertibility for $C$, so we can use a random function $F$
instead of $C$. This construction is provably secure when $\ell$ is fixed,
provided that $F$ is already secure. This informal result can be quantified as
follows.
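The restriction to a fixed $\ell$ matters. The sketch below implements the CBC-MAC above with a lazily sampled random function and XOR as the addition (both are assumptions of this example), and then replays the classical extension forgery that becomes possible as soon as messages of different lengths are accepted. The helper names are illustrative.

```python
import random

def make_random_function(m, seed=0):
    """Lazily sampled random function on m-bit blocks (a stand-in for F)."""
    rng = random.Random(seed)
    table = {}
    def f(block):
        if block not in table:
            table[block] = rng.randrange(1 << m)
        return table[block]
    return f

def cbc_mac(f, blocks):
    """MAC(m_1, ..., m_l) = F(F(... F(m_1) + m_2 ...) + m_l), with + taken as XOR."""
    state = 0
    for block in blocks:
        state = f(state ^ block)
    return state

f = make_random_function(16, seed=1)
message = 0x1234
tag = cbc_mac(f, [message])

# Forgery across lengths: the two-block message (m, m xor tag) has the same tag,
# because the second F input is F(m) xor m xor tag = m.
forged_tag = cbc_mac(f, [message, message ^ tag])
print("forgery succeeds:", forged_tag == tag)
```

With a fixed number of blocks this forgery is not a valid query, which is the setting of the lemma that follows.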

Lemma 9 (Bellare-Kilian-Rogaway 1994 [4]). For any fixed integer $\ell$, we
consider the function MAC defined on $\ell$ $m$-bit blocks from a uniformly
distributed random function $F^*$ as above. For any $d$ we have
$\mathrm{DecF}^d_{\|.\|_a}(\mathrm{MAC}) \le d\ell(d\ell - 1) 2^{-m}$.³

When $F$ is pseudorandom it becomes trivial from this and Lemma 2 that

$$\mathrm{DecF}^d_{\|.\|_a}(\mathrm{MAC}) \le \mathrm{DecF}^{\ell d}_{\|.\|_a}(F) + d\ell(d\ell - 1) 2^{-m}.$$

³ The original result from [4] is $\mathrm{DecF}^d_{\|.\|_a}(\mathrm{MAC}) \le$ …

Proof (sketch). By using Lemma 4 again, we take $\mathcal{Y}$ equal to the full
set of $y = (y_1, \ldots, y_d)$. Thus we have $\epsilon_1 = 0$. Let us define the
random variable $U_{i,j}$ as the input of the $j$th $F^*$ computation on the
message $x_i = (m_{i,1}, \ldots, m_{i,\ell})$. We consider the event $E$ that all
$U_{i,j}$ for any $i$ and $j < \ell$ are different from the $U_{k,\ell}$ for any
$k$, and all $U_{k,\ell}$ are pairwise different. As usual we obtain

$$\Pr[\mathrm{MAC}(x_i) = y_i ; \; i = 1, \ldots, d] \ge 2^{-md} \Pr[E]$$

since $\Pr[\mathrm{MAC}(x_i) = y_i ; \; i = 1, \ldots, d / E] = 2^{-md}$. Thus we
can take $\epsilon_2 = \Pr[\bar{E}]$.

Let Coll be the event that we have an $F^*$-collision
$F^*(U_{i,j}) = F^*(U_{r,s})$ with $U_{i,j} \ne U_{r,s}$ for some values of
$i, j, r, s$ with $j < \ell$ and $s < \ell$. Since the number of queries to $F^*$
for this is $d(\ell - 1)$, we have

$$\Pr[\mathrm{Coll}] \le \frac{d(\ell-1)(d(\ell-1)-1)}{2} 2^{-m}.$$

Now we have

$$\Pr[\bar{E}] \le \Pr[\mathrm{Coll}] + \Pr[\bar{E} \text{ and } \overline{\mathrm{Coll}}].$$

Let us study the $U_{i,j} = U_{k,\ell}$ event with $j \le \ell$ and $i \ne k$ if
$j = \ell$ when $\overline{\mathrm{Coll}}$ holds. We let $r$ be the smallest
positive integer such that $m_{i,j-r} \ne m_{k,\ell-r}$ (with the convention that
$m_{i,0} \ne m_{k,\ell-j}$; we notice that if $j = \ell$, since the $i$th and the
$k$th messages are different we must have $r < \ell$). Since we have no collision
on $F^*$ we must have

- either $F^*(U_{i,j-r-1}) + m_{i,j-r} = F^*(U_{k,\ell-r-1}) + m_{k,\ell-r}$ if $r < j-1$
- or $m_{i,1} = F^*(U_{k,\ell-r-1}) + m_{k,\ell-r}$ if $r = j-1 < \ell-1$
- or $F^*(U_{k,\ell-r}) = 0$ if $r = j$.

Since we have no collision all these events hold with probability at most
$2^{-m}$. Therefore

$$\Pr[\bar{E} \text{ and } \overline{\mathrm{Coll}}] \le \left( (\ell-1) d^2 + \frac{d(d-1)}{2} \right) 2^{-m}.$$

Thus we can take $\epsilon_2 = \frac{d\ell(d\ell-1)}{2} 2^{-m}$.

□
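The closing arithmetic of this proof can be checked by hand. A short verification sketch, assuming the two partial bounds as read from the proof (the collision bound on the $d(\ell-1)$ inner queries, plus $(\ell-1)d^2 + d(d-1)/2$ remaining pairs), and writing $n = d\ell$ as a shorthand that is not in the original:

```latex
\begin{align*}
\frac{d(\ell-1)\bigl(d(\ell-1)-1\bigr)}{2} + (\ell-1)d^2 + \frac{d(d-1)}{2}
  &= \frac{(n-d)(n-d-1)}{2} + (n-d)d + \frac{d(d-1)}{2} \\
  &= \frac{(n-d)^2 + 2(n-d)d + d^2 - (n-d) - d}{2} \\
  &= \frac{n^2 - n}{2} = \frac{d\ell(d\ell-1)}{2}.
\end{align*}
```

Multiplying by $2^{-m}$ gives $\epsilon_2$, and Lemma 4 with $\epsilon_1 = 0$ then yields $2\epsilon_2 = d\ell(d\ell-1)2^{-m}$, the bound of Lemma 9.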



3.2 Similar Results 

Lemma 9 means that we can make one Fixed-Input-Length (FIL) MAC from one
Single-Block-Input (SBI) MAC. There are other related results. Namely we can
consider making a Variable-Input-Length (VIL) MAC from a FIL-MAC (see [1]),
or directly from an SBI-MAC. We can consider making VIL-encryption from
SBI-encryption⁴ (see [3]), an SBI-MAC from SBI-encryption, and so on. The
technique is basically the same and straightforward from our presentation of
Lemma 9.

In order to define decorrelation biases for VIL-MAC, we need to face the
problem of having infinite sets. Let for instance $F$ be a random function
defined from $\mathcal{M}_1^*$ to $\mathcal{M}_2$ ($\mathcal{M}_1^*$ is the set of
all finite sequences with entries in $\mathcal{M}_1$). We define the matrix
$[F]^{q_1, \ldots, q_d}$ with rows defined on
$\mathcal{M}_1^{q_1} \times \ldots \times \mathcal{M}_1^{q_d}$ and columns defined
on $\mathcal{M}_2^d$. Next we define $\mathrm{DecF}^{q_1, \ldots, q_d}_D(F)$ as
the $D$-distance between $[F]^{q_1, \ldots, q_d}$ and $[F^*]^{q_1, \ldots, q_d}$.
We can easily check that all previous theorems remain valid for these
definitions. Additionally, we can define

$$\mathrm{DecF}^{d,q}_D(F) = \max_{q_1 + \ldots + q_d = q} \mathrm{DecF}^{q_1, \ldots, q_d}_D(F)$$

and still check that
$\mathrm{BestAdv}_{\mathrm{Cl}^{d,q}_a}(F, F^*) = \frac{1}{2} \mathrm{DecF}^{d,q}_{\|.\|_a}(F)$
where $\mathrm{Cl}^{d,q}_a$ is the class of adaptive attacks limited to $d$
queries with a total length of $q$ blocks.

⁴ This is usually referred to as “mode of operation” in the literature.

Here is for instance a VIL-MAC from FIL-MAC construction which improves
the An-Bellare result [1].

Lemma 10. Let $F_1$ and $F_2$ be two independent random functions defined from
$\{0,1\}^{m+b}$ to $\{0,1\}^b$. For any $\ell$ and any
$(m_1, \ldots, m_\ell) \in (\{0,1\}^m)^\ell$ we define

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = F_2(F_1(\ldots F_1(F_1(0, m_1), m_2) \ldots, m_\ell), \ell)$$

where 0 means a $b$-bit zero string, and $\ell$ means an $m$-bit string which
represents the $\ell$ value. Considering distinguishers limited to $d$ queries
and a total length of $qm$ bits we have

$$\mathrm{DecF}^{d,q}_{\|.\|_a}(\mathrm{MAC}) \le \mathrm{DecF}^q_{\|.\|_a}(F_1) + \mathrm{DecF}^d_{\|.\|_a}(F_2) + q(q-1) 2^{-b}.$$

Proof (sketch). As in the proof of Lemma 9 we consider the event $E$ where all
$F_2$ inputs are pairwise different. In Lemma 4 we have $\epsilon_1 = 0$ and
$\epsilon_2 = \Pr[\bar{E}]$. If we have two equal $F_2$ inputs, we must have a
collision on $F_1$, thus $\Pr[\bar{E}] \le \frac{q(q-1)}{2} 2^{-b}$. □

Note that we can still use

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = F_2(F_1(\ldots F_1(F_1(0, m_1), m_2) \ldots, m_\ell))$$

with $F_2$ defined on $\{0,1\}^b$ (the same construction, but without the message
length) and obtain

$$\mathrm{DecF}^{d,q}_{\|.\|_a}(\mathrm{MAC}) \le \mathrm{DecF}^q_{\|.\|_a}(F_1) + \mathrm{DecF}^d_{\|.\|_a}(F_2) + q(q+1) 2^{-b}.$$

(In the proof, any equal $F_2$ inputs lead to a collision on $F_1$ or a preimage
of 0.)
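Both variants of this VIL-MAC construction can be written in a few lines. The sketch below is illustrative only: the two random functions are simulated by lazily filled tables, the chaining value is assumed to be $b$ bits wide, and the block count is passed to $F_2$ as a plain integer rather than as an encoded $m$-bit string. The names `lazy_random_function` and `vil_mac` are hypothetical.

```python
import random

def lazy_random_function(out_bits, seed):
    """Lazily sampled random function; any tuple of arguments is a valid input."""
    rng = random.Random(seed)
    table = {}
    def f(*args):
        if args not in table:
            table[args] = rng.randrange(1 << out_bits)
        return table[args]
    return f

def vil_mac(f1, f2, blocks, with_length=True):
    """F2(F1(... F1(F1(0, m_1), m_2) ..., m_l), l), or the variant without l."""
    chain = 0                          # the b-bit zero string
    for block in blocks:
        chain = f1(chain, block)       # chaining value stays b bits wide
    return f2(chain, len(blocks)) if with_length else f2(chain)

B = 32
f1 = lazy_random_function(B, seed=1)
f2 = lazy_random_function(B, seed=2)

tag_with_length = vil_mac(f1, f2, [5, 7])
tag_without_length = vil_mac(f1, f2, [5, 7], with_length=False)
print(hex(tag_with_length), hex(tag_without_length))
```

In the variant without the length, a chaining value equal to 0 makes a message indistinguishable from one of its suffixes, which matches the extra "preimage of 0" case mentioned in the text.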
We state an ultimate similar result with the CBC-MAC construction. 

Lemma 11 ([26]). Let $C_1$ and $C_2$ be two independent uniformly distributed
random permutations on $\{0,1\}^m$. For any $\ell$ and any
$(m_1, \ldots, m_\ell) \in (\{0,1\}^m)^\ell$ we define

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = C_2(C_1(\ldots C_1(C_1(m_1) + m_2) \ldots + m_{\ell-1}) + m_\ell).$$

Considering distinguishers limited to $d$ queries and a total length of $qm$ bits
we have

$$\mathrm{DecF}^{d,q}_{\|.\|_a}(\mathrm{MAC}) \le d(d-1) 2^{-m} + q(q+1)(1 + q 2^{-m}) 2^{-m}.$$

The result holds for any (quasi)group addition.

Proof (sketch). Using Lemma 4, let $\mathcal{Y}$ be the set of all
$y = (y_1, \ldots, y_d)$ with pairwise different $y_i$s. We thus have
$\epsilon_1 = \frac{d(d-1)}{2} 2^{-m}$. Now for any collection of
$x_i = (m_{i,1}, \ldots, m_{i,\ell_i})$ we let

$$U_{i,j} = C_1(\ldots C_1(C_1(m_{i,1}) + m_{i,2}) \ldots + m_{i,j-1}) + m_{i,j}.$$

We consider the event $E$ that all $U_{i,\ell_i}$ are pairwise different. We have

$$[\mathrm{MAC}]^{\ell_1, \ldots, \ell_d}_{x,y} \ge 2^{-md} (1 - \Pr[\bar{E}])$$

therefore we can take
$\epsilon_2 = \Pr[\bar{E}] = \Pr[\exists i < r ; \; U_{i,\ell_i} = U_{r,\ell_r}]$.

The $U_{i,\ell_i} = U_{r,\ell_r}$ event reduces either to a collision or to a
preimage of 0 for $C_1$. Let Inv be the event that $C_1(U_{i,j}) = 0$ for some
$i, j$, and let Coll be the event that we have $U_{i,j} = U_{r,s}$ for some
$i, j, r, s$ such that
$(m_{i,1}, \ldots, m_{i,j}) \ne (m_{r,1}, \ldots, m_{r,s})$. We have
$\epsilon_2 \le \Pr[\mathrm{Inv}] + \Pr[\mathrm{Coll}]$.

The probability that any adaptive attack against $C_1$ finds a preimage of 0
after $q$ queries is obviously less than $\frac{q}{2^m - q}$. Thus
$\Pr[\mathrm{Inv}] \le \frac{q}{2^m - q}$.

We let $\mathcal{I}$ be the set of all $((i,j),(r,s))$ pairs such that
$1 \le j \le \ell_i$ and $1 \le s \le \ell_r$ and
$(m_{i,1}, \ldots, m_{i,j}) \ne (m_{r,1}, \ldots, m_{r,s})$. This is the set of
all potential $U_{i,j} = U_{r,s}$ collision indices. We define $c(i,j,r,s)$ equal
to the set of all $(i,j')$ and $(r,s')$ such that $j' \le j$ and $s' \le s$. We
define an ordering on $\mathcal{I}$ by

$$((i,j),(r,s)) \le ((i',j'),(r',s')) \iff c(i,j,r,s) \subseteq c(i',j',r',s').$$

For $((i,j),(r,s)) \in \mathcal{I}$ we let $\mathrm{Coll}_{i,j,r,s}$ be the event
that $((i,j),(r,s))$ is the minimal pair in $\mathcal{I}$ such that
$U_{i,j} = U_{r,s}$. We have

$$\Pr[\mathrm{Coll}] \le \frac{q(q-1)}{2} \max_{((i,j),(r,s)) \in \mathcal{I}} \Pr[\mathrm{Coll}_{i,j,r,s}].$$

For $((i,j),(r,s)) \in \mathcal{I}$, let us consider the
$\mathrm{Coll}_{i,j,r,s}$ event. We assume without loss of generality that
$s \le j$. Since we have no previous collision we must have
$m_{i,j} \ne m_{r,s}$. Furthermore we must have $U_{i,j-1} \ne U_{r,s-1}$ and
$j > 1$, and we need to consider the event

$$C_1(U_{i,j-1}) = U_{r,s} - m_{i,j}.$$

If $U_{i,j-1}$ is equal to some $U_{i',j'}$ with $(i,j-1) \ne (i',j')$ and
$(i',j') \in c(i,j,r,s)$, we must have $((i,j-1),(i',j')) \notin \mathcal{I}$
(otherwise we have a previous collision) which means $j' = j-1$ and
$i' = r \ne i$ and
$(m_{i,1}, \ldots, m_{i,j-1}) = (m_{r,1}, \ldots, m_{r,j-1})$. If $s < j$ we have
$U_{i,j} = U_{r,s} = U_{i,s}$ with $((i,j),(i,s)) \in \mathcal{I}$, which
contradicts the minimality of the collision. Thus $s = j$, but this contradicts
$U_{i,j-1} \ne U_{r,s-1}$. Hence $U_{i,j-1}$ is not equal to any $U_{i',j'}$ with
a previously used $(i',j')$. The distribution of $C_1(U_{i,j-1})$ is thus uniform
among a set of at least $2^m - q$ elements and independent of $U_{r,s}$. Hence
$\Pr[\mathrm{Coll}_{i,j,r,s}] \le \frac{1}{2^m - q}$.

Finally we obtain

$$\epsilon_2 \le \frac{q(q-1)}{2} \times \frac{1}{2^m - q} + \frac{q}{2^m - q} \le \frac{q(q+1)}{2} (1 + q 2^{-m}) 2^{-m}.$$

□



4 Conclusion 

We have shown how to make several results on conventional cryptography more 
systematic by using decorrelation theory. The new presentation of the results 
helps to understand and improve them. Namely, we improved a few of the pseu- 
dorandomness results. We also improved results related to MAC constructions, 
and we proved that the encrypted CBC-MAC construction is secure. 
Abstract. In its intended usage the lengths of the key stream sequences
produced by the Bluetooth stream cipher $E_0$ are strictly limited. In this
paper the importance of this limitation is proved by showing that the
Bluetooth stream cipher with 128 bit key can be broken in $O(2^{64})$ steps
given an output key stream segment of length $O(2^{64})$. We also show how
the correlation properties of the $E_0$ combiner can be improved by making
a small modification in the memory update function.



1 Introduction 

Bluetooth™ is a standard for wireless connectivity specified by the Bluetooth™
Special Interest Group in [1]. The specification defines a stream cipher algorithm
$E_0$ to be used for point-to-point encryption between the elements of a Bluetooth
network. The structure of $E_0$ is a modification of a summation bit generator
with memory. In this paper we call it the Bluetooth combiner and analyze its
correlation properties. A few correlation theorems originating from [4] are stated
and exploited in the analysis. Also a new kind of divide-and-conquer attack is
introduced, which shows the importance of limiting the lengths of produced key
stream sequences.

As a consequence of these results, we propose a modification to the Bluetooth 
combiner. This modification could be done at no extra cost, that is, it does 
not increase the complexity of the algorithm. But, on the other hand, it would 
improve the correlation properties of the Bluetooth combiner to some extent. 
However, as long as no practical attack is known against the current version of 
the Bluetooth combiner, the results given in this paper remain theoretical. 

2 Correlation Theorems 

2.1 Definitions and Notation 

Let us introduce the notation to be used throughout this paper. We shall consider
the field $GF(2^n)$ as a linear space with a given fixed basis, and denote by $x_t$ an
$n$-dimensional vector in $GF(2^n)$ as $x_t = (x_t^1, x_t^2, \ldots, x_t^n)$. The inner product
between two vectors $w = (w_1, w_2, \ldots, w_n)$ and $x = (x_1, x_2, \ldots, x_n)$ of the space
$GF(2^n)$ is defined as

$$w \cdot x = w_1 x_1 \oplus w_2 x_2 \oplus \ldots \oplus w_n x_n.$$

The linear function $L_u(x)$ is then

$$L_u(x) = u \cdot x, \quad u, x \in GF(2^n).$$

We use the same definition of correlation between two Boolean functions as in 
[3] , where it is also referred to as “normalized correlation” . 

Definition 1. Let $f, g : GF(2^n) \to GF(2)$ be Boolean functions. The correlation
between $f$ and $g$ is

$$c(f, g) = 2^{-n}\left(\#\{x \in GF(2^n) \mid f(x) = g(x)\} - \#\{x \in GF(2^n) \mid f(x) \ne g(x)\}\right).$$

Sometimes the notation $c_x(f(x), g(x))$ is used to emphasize the variable with
respect to which the correlation is to be calculated.

Finally, we recall Parseval’s theorem, which implies, in particular, that any 
Boolean function is correlated to some linear functions. 

Theorem 2. (Parseval’s Theorem)

$$\sum_{w \in GF(2^n)} c(f, L_w)^2 = 1.$$
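Definition 1 and Theorem 2 can be checked by brute force on small functions. The following is a minimal sketch, not part of the paper: the helper names and the example function `majority3` are assumptions chosen only for illustration.

```python
from itertools import product

def dot(u, x):
    """Inner product over GF(2) of two equal-length bit tuples."""
    return sum(a & b for a, b in zip(u, x)) & 1

def correlation(f, g, n):
    """Normalized correlation of Definition 1 for f, g on n-bit inputs."""
    agree = sum(1 if f(x) == g(x) else -1 for x in product((0, 1), repeat=n))
    return agree / 2 ** n

def parseval_sum(f, n):
    """Sum of squared correlations of f with every linear function L_w."""
    return sum(correlation(f, lambda x, w=w: dot(w, x), n) ** 2
               for w in product((0, 1), repeat=n))

def majority3(x):
    # Hypothetical example function: majority vote of three bits.
    return 1 if sum(x) >= 2 else 0

print(correlation(majority3, lambda x: x[0], 3))  # correlation with x_1
print(parseval_sum(majority3, 3))                 # Parseval sum over all L_w
```

For the majority function the four non-zero correlations each have absolute value 1/2, so the squares add up to one as Theorem 2 states.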

2.2 Correlation Theorems 

Iterated structures and combinations of transformations with common input 
are frequently seen building blocks of cryptographic algorithms. The following 
correlation theorems are useful in the analysis of propagation of correlations over 
such structures. The proofs of the theorems can be found in [4]. 

Theorem 3. Given functions $f : GF(2^n) \times GF(2^k) \to GF(2)$ and $g : GF(2^m) \to GF(2^k)$ we set

$$h(x, y) = f(x, g(y)), \quad x \in GF(2^n),\ y \in GF(2^m).$$

Then, for all $u \in GF(2^n)$, $v \in GF(2^m)$,

$$c_{x,y}(h(x, y), u \cdot x \oplus v \cdot y) = \sum_{w \in GF(2^k)} c_{x,z}(f(x, z), u \cdot x \oplus w \cdot z)\, c_y(w \cdot g(y), v \cdot y).$$
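Theorem 3 is an exact identity, so it can be verified exhaustively for small parameters. The sketch below is an illustration only: it draws random lookup tables for $f$ and $g$ (the sizes $n = 2$, $k = 2$, $m = 3$ and the seeds are arbitrary assumptions) and compares both sides with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product
import random

def bits(n):
    return list(product((0, 1), repeat=n))

def dot(u, x):
    return sum(a & b for a, b in zip(u, x)) & 1

def corr(f, g, domain):
    """Exact normalized correlation of two Boolean functions over `domain`."""
    total = sum(1 if f(p) == g(p) else -1 for p in domain)
    return Fraction(total, len(domain))

def theorem3_holds(n=2, k=2, m=3, seed=1):
    rng = random.Random(seed)
    f_tab = {(x, z): rng.randint(0, 1) for x in bits(n) for z in bits(k)}
    g_tab = {y: rng.choice(bits(k)) for y in bits(m)}
    xy = [(x, y) for x in bits(n) for y in bits(m)]
    xz = [(x, z) for x in bits(n) for z in bits(k)]
    for u in bits(n):
        for v in bits(m):
            # Left side: correlation of h(x, y) = f(x, g(y)) with u.x xor v.y.
            lhs = corr(lambda p: f_tab[p[0], g_tab[p[1]]],
                       lambda p: dot(u, p[0]) ^ dot(v, p[1]), xy)
            # Right side: sum over the intermediate mask w.
            rhs = sum(
                corr(lambda p: f_tab[p],
                     lambda p: dot(u, p[0]) ^ dot(w, p[1]), xz)
                * corr(lambda y: dot(w, g_tab[y]), lambda y: dot(v, y), bits(m))
                for w in bits(k))
            if lhs != rhs:
                return False
    return True

print(theorem3_holds())
```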

We note that Theorem 3 can be considered as a generalization of Lemma 2 of 
[3]. In the second correlation theorem a Boolean function, which is a sum of two 
functions with partially common input, is considered. 

Theorem 4. Let $f : GF(2^n) \times GF(2^k) \to GF(2)$ and $g : GF(2^k) \times GF(2^m) \to GF(2)$
be Boolean functions. Then, for all $u \in GF(2^n)$, $w \in GF(2^m)$,

$$c_{x,y,z}(f(x, y) \oplus g(y, z), u \cdot x \oplus w \cdot z) = \sum_{v \in GF(2^k)} c_{x,y}(f(x, y), u \cdot x \oplus v \cdot y)\, c_{y,z}(g(y, z), v \cdot y \oplus w \cdot z).$$

If here the two functions $f$ and $g$, and the two linear combinations $u$ and $w$ are
the same, we have the following corollary.

Corollary 5. Let $f : GF(2^n) \times GF(2^k) \to GF(2)$ be a Boolean function. Then,
for all $u \in GF(2^n)$,

$$c_{x,y,\xi}(f(x, y) \oplus f(\xi, y), u \cdot (x \oplus \xi)) = \sum_{v \in GF(2^k)} c_{x,y}(f(x, y), u \cdot x \oplus v \cdot y)^2.$$
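Corollary 5 is the identity used later for the attack over a register period, so a numerical check may be helpful. This is a sketch under assumptions: the function $f$ is a random lookup table and the sizes $n = 3$, $k = 2$ are arbitrary.

```python
from fractions import Fraction
from itertools import product
import random

def bits(n):
    return list(product((0, 1), repeat=n))

def dot(u, x):
    return sum(a & b for a, b in zip(u, x)) & 1

def corollary5_sides(f_tab, n, k, u):
    """Return (lhs, rhs) of Corollary 5 for the linear mask u."""
    # Left side: correlation over (x, y, xi) of f(x,y) xor f(xi,y)
    # with u.(x xor xi) = u.x xor u.xi.
    total = 0
    for x, y, xi in product(bits(n), bits(k), bits(n)):
        bit = f_tab[x, y] ^ f_tab[xi, y] ^ dot(u, x) ^ dot(u, xi)
        total += 1 - 2 * bit
    lhs = Fraction(total, 2 ** (2 * n + k))
    # Right side: sum over v of the squared correlations c(f, u.x xor v.y).
    rhs = Fraction(0)
    for v in bits(k):
        s = sum(1 - 2 * (f_tab[x, y] ^ dot(u, x) ^ dot(v, y))
                for x in bits(n) for y in bits(k))
        rhs += Fraction(s, 2 ** (n + k)) ** 2
    return lhs, rhs

rng = random.Random(5)
n, k = 3, 2
f_tab = {(x, y): rng.randint(0, 1) for x in bits(n) for y in bits(k)}
results = [corollary5_sides(f_tab, n, k, u) for u in bits(n)]
for u, (lhs, rhs) in zip(bits(n), results):
    print(u, lhs, rhs)
```

Both columns agree for every mask $u$, and the common value is never negative because the right side is a sum of squares.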
3 Combination Generators

In [3] an example of a summation bit generator with one bit of memory is 
introduced and analyzed. The combiner of the Bluetooth key stream generator 
can be considered as a variation of the thoroughly analyzed basic summation 
bit generator, see [3] and [2]. A general class of combination generators with 
memory giving the generators of [3] and [1] as special cases is defined as follows: 

$$z_t = \bigoplus_{i=1}^{n} x_t^i \oplus c_t^0 \qquad (1)$$

$$c_t = f(x_{t-1}, c_{t-1}, \ldots, c_{t-d}). \qquad (2)$$

Here $x_t = (x_t^1, \ldots, x_t^n) \in GF(2^n)$ is the fresh input to the combiner at time $t$
and $c_t^0 \in GF(2)$ is the one-bit input from the memory, $t = 0, 1, 2, \ldots$. The fresh
input is formed by $n$ independent sequences $x^i = (x_0^i, x_1^i, \ldots)$, $i = 1, 2, \ldots, n$,
which are typically generated by $n$ linear feedback shift registers (LFSRs).

The memory consists of $md$ bits arranged as a register of $d$ consecutive
cells of $m$ bits each. The memory is updated by computing a new $m$-bit $c_t =
(c_t^{m-1}, \ldots, c_t^0)$ using a function $f$ from the fresh input and from the contents of the
memory, saving the new $c_t$ in the memory and discarding $c_{t-d}$. The output bit
$z_t$ is computed as an xor-sum of the fresh input $x_t$ and the previously computed
update $c_t$ of the memory.

Correlation attacks aimed at recovering the keys, which determine the gen- 
eration of the fresh input, are based on correlations between a number of fresh 
input bits and the output bits. For the type of generators defined by (1) such
correlations can be derived from correlations between consecutive “carry”
bits $c_t^0$.

Such correlations are usually found by exhaustive search. This is the case 
also with the Bluetooth combiner which is such a relatively small system that 
this kind of “trial and error”-search is possible. In larger systems, however, some 
more sophisticated means for finding these correlation equations should be used. 
One such method is presented in [2] . 

4 Bluetooth Combiner 

Bluetooth chips are small components capable of short range communication 
with each other. The Bluetooth specification is given in [1]. In the security part
of [1] an encryption algorithm $E_0$ is specified to be used for protection of the
confidentiality of the Bluetooth communication.

The algorithm $E_0$ is of the form specified by (1) and (2). It consists of four
LFSRs of length 128 in total and a non-linear memory update function $f$, which is
a composition of a nonlinear function $f_1$ and a linear mapping $T$.

The functions define the following recursive equations. The output key
sequence $z_t$, used to encipher the plaintext, is

$$z_t = x_t^1 \oplus x_t^2 \oplus x_t^3 \oplus x_t^4 \oplus c_t^0,$$

where $(x_t^1, x_t^2, x_t^3, x_t^4)$ is the fresh input at time $t$ produced by the four LFSRs.

Non-linearity is represented in the sequence $s_t$, defined by the following formula,
where “+” means the ordinary integer sum:

$$s_{t+1} = (s_{t+1}^1, s_{t+1}^0) = \left\lfloor \frac{x_t^1 + x_t^2 + x_t^3 + x_t^4 + 2c_t^1 + c_t^0}{2} \right\rfloor = f_1(x_t, c_t).$$

The function $f_1$ introduces the necessary non-linearity in the system, as integer
summation is non-linear in $GF(2)$. The memory bits $c_t$ are then defined with
the aid of $s_t$ as

$$c_{t+1} = (c_{t+1}^1, c_{t+1}^0) = T(s_{t+1}, c_t, c_{t-1}) = T_0(s_{t+1}) \oplus T_1(c_t) \oplus T_2(c_{t-1}).$$

Here $T_0$, $T_1$ and $T_2$ are linear transformations. Although non-linearity is crucial
for security, the choice of the linear mapping $T$ also has a certain influence on the
security of the $E_0$ algorithm, as we will see later.



4.1 The Mapping T 



The linear mapping $T$ of $E_0$ mixes the old bits from the memory into the new
updated memory bits. The main focus of this work is to investigate its role in
the correlation properties. For the given $f_1$ we see (cf. Table 1) that
$c(s_t^0, c_{t-1}^0 \oplus u \cdot x_{t-1}) = 0$ for all $u \in GF(2^4)$. Hence these are not useful in correlation attacks.
On the other hand, $c(s_t^i, c_{t-1}^1 \oplus v^0 c_{t-1}^0 \oplus u \cdot x_{t-1}) \ne 0$ for suitable $u$ and $v^0$, $i = 0, 1$.

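The mapping $T$ consists of three mappings, $T_0$, $T_1$ and $T_2$, as

$$c_{t+1} = T(s_{t+1}, c_t, c_{t-1}) = T_0(s_{t+1}) \oplus T_1(c_t) \oplus T_2(c_{t-1}).$$

In matrix form, $T_0 = T_1 = I$, where $I$ is the $2 \times 2$ identity matrix. Further,

$$T_2 = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}.$$

This means, the bits of $c_t = (c_t^1, c_t^0)$ are

$$c_t^1 = s_t^1 \oplus c_{t-1}^1 \oplus c_{t-2}^0 \qquad (3)$$

$$c_t^0 = s_t^0 \oplus c_{t-1}^0 \oplus c_{t-2}^1 \oplus c_{t-2}^0. \qquad (4)$$

With different choices of $T_0$, $T_1$ and $T_2$ the correlation properties of the system
become different. This shall be analyzed in section 5.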
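The recursions of this section translate directly into a few lines of code. The following is a minimal sketch of one clock of the combiner as described by the output equation, the integer-sum formula for $s_{t+1}$, and equations (3) and (4); the LFSRs are left out and the function names are assumptions made for illustration.

```python
def f1(x, c):
    """Nonlinear part: s_{t+1} = floor((x1 + x2 + x3 + x4 + 2*c1 + c0) / 2).

    x is a tuple of four input bits, c = (c1, c0) the current memory bits.
    Returns (s1, s0), the two bits of the integer result (at most 3).
    """
    s = (sum(x) + 2 * c[0] + c[1]) // 2
    return (s >> 1) & 1, s & 1

def t2(c):
    """Linear map T_2 applied to c = (c1, c0), matching equations (3), (4)."""
    return c[1], c[0] ^ c[1]

def step(x, c_t, c_prev):
    """One clock: return (z_t, c_{t+1}) from x_t, c_t and c_{t-1}."""
    z = x[0] ^ x[1] ^ x[2] ^ x[3] ^ c_t[1]           # output uses c_t^0 only
    s = f1(x, c_t)
    mixed = t2(c_prev)
    c_next = (s[0] ^ c_t[0] ^ mixed[0],              # T_0 = T_1 = I
              s[1] ^ c_t[1] ^ mixed[1])
    return z, c_next

print(step((1, 0, 0, 0), (0, 1), (1, 0)))
```

Replacing `t2` by another invertible linear map is all that is needed to experiment with the alternative mappings discussed in section 5.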



4.2 Correlation Analysis of the Bluetooth Combiner 

The memory in Bluetooth has four bits, two bits for each of two consecutive time
steps $t$ and $t - 1$. The function, which is used to form a new term $z_t$ of the key
stream, is linear. The non-linearity is gained from the function $f_1$, which is used
to calculate $s_t$. As argued in [3], and in more general terms in [2], there always
remain some correlations in such a system. They shall be analyzed next.
The analysis exploits the correlations of the form

$$c(w \cdot s_{t+1}, u \cdot x_t \oplus v \cdot c_t)$$

where $u \in GF(2^4)$ and $v = (v^1, v^0) \in GF(2^2)$. Different choices of $u$ and $v$
correspond to different linear combinations of $x_t^1, x_t^2, x_t^3, x_t^4, c_t^1$ and $c_t^0$. In the
following Table 1 all the correlations are presented.
Table 1. The correlations for $s_{t+1}^1$ and $s_{t+1}^0$
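The correlations of this form can be recomputed by exhaustive search over the 64 inputs of $f_1$. The sketch below assumes the integer-sum definition of $f_1$ from section 4 and the bit ordering $(s^1, s^0)$, $(c^1, c^0)$; the function names are hypothetical, and the printed layout is not claimed to match the layout of Table 1.

```python
from fractions import Fraction
from itertools import product

def f1(x, c):
    """s_{t+1} = floor((x1 + x2 + x3 + x4 + 2*c1 + c0) / 2) as bits (s1, s0)."""
    s = (sum(x) + 2 * c[0] + c[1]) // 2
    return (s >> 1) & 1, s & 1

def dot(a, b):
    return sum(p & q for p, q in zip(a, b)) & 1

def table1_entry(w, u, v):
    """Exact correlation c(w.s_{t+1}, u.x_t xor v.c_t) over all 64 inputs."""
    total = 0
    for x in product((0, 1), repeat=4):
        for c in product((0, 1), repeat=2):
            bit = dot(w, f1(x, c)) ^ dot(u, x) ^ dot(v, c)
            total += 1 - 2 * bit
    return Fraction(total, 64)

# One representative u per Hamming weight, since the x^i are interchangeable.
reps = [(0, 0, 0, 0), (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1)]
for w, name in (((1, 0), "s1"), ((0, 1), "s0")):
    for v in product((0, 1), repeat=2):
        row = [str(table1_entry(w, u, v)) for u in reps]
        print(name, "v =", v, row)
```

Among the values obtained this way are $c(s_{t+1}^0, c_t^1) = -\frac{1}{4}$ and $c(s_{t+1}^1, c_t^0) = \frac{1}{4}$, the two factors used in the derivation of the strongest correlation relation.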



We note that, since the system is symmetric with respect to each $x_t^i$, only the
Hamming weight of $u$ is of importance. We also see that for $s_{t+1}^0$ the correlation
is zero, if $v^1 = 0$. Next we present the derivation of the strongest correlation relation
we found within the Bluetooth combiner.

Add $c_{t-3}^0$ to both sides of (4) and rearrange the terms to get

$$c_t^0 \oplus c_{t-1}^0 \oplus c_{t-3}^0 = s_t^0 \oplus c_{t-2}^1 \oplus c_{t-2}^0 \oplus c_{t-3}^0. \qquad (5)$$

Next we use Theorem 3 to get

$$\begin{aligned}
c(c_t^0 \oplus c_{t-1}^0 \oplus c_{t-3}^0, 0) &= c(s_t^0, c_{t-2}^1 \oplus c_{t-2}^0 \oplus c_{t-3}^0) \\
&= \sum_{w \in GF(2^2)} c(s_t^0, w \cdot c_{t-1})\, c(w \cdot c_{t-1}, c_{t-2}^1 \oplus c_{t-2}^0 \oplus c_{t-3}^0),
\end{aligned}$$

with $u = 0$, $v = (0, 0, 1, 1, 1, 0)$ and $y = (s_{t-1}^0, s_{t-1}^1, c_{t-2}^1, c_{t-2}^0, c_{t-3}^0, c_{t-3}^1)$. Now
from Table 1 we know that the terms of the sum are zero for $w = (0, 1)$, $w =
(1, 1)$ and $w = (0, 0)$. Only the term with $w = (1, 0)$ remains. So, the correlation
equation is simplified to

$$\begin{aligned}
c(c_t^0 \oplus c_{t-1}^0 \oplus c_{t-3}^0, 0) &= c(s_t^0, c_{t-2}^1 \oplus c_{t-2}^0 \oplus c_{t-3}^0) \\
&= c(s_t^0, c_{t-1}^1)\, c(s_{t-1}^1, c_{t-2}^0).
\end{aligned}$$

Here the last equation is obtained by moving back in time for one step in equation
(3), so that

$$c_{t-1}^1 = c_{t-2}^1 \oplus s_{t-1}^1 \oplus c_{t-3}^0.$$

Using the values of Table 1, we finally get

$$c(c_t^0 \oplus c_{t-1}^0 \oplus c_{t-3}^0, 0) = -\frac{1}{16}. \qquad (6)$$

After this we notice that

$$z_t \oplus z_{t-1} \oplus z_{t-3} = \bigoplus_{i=1}^{4} x_t^i \oplus \bigoplus_{i=1}^{4} x_{t-1}^i \oplus \bigoplus_{i=1}^{4} x_{t-3}^i \oplus c_t^0 \oplus c_{t-1}^0 \oplus c_{t-3}^0.$$

We conclude by equation (6) that

$$c\Big(z_t \oplus z_{t-1} \oplus z_{t-3},\ \bigoplus_{i=1}^{4} x_t^i \oplus \bigoplus_{i=1}^{4} x_{t-1}^i \oplus \bigoplus_{i=1}^{4} x_{t-3}^i\Big) = -\frac{1}{16}. \qquad (7)$$

Since the output function of the Bluetooth combiner is XOR, it is maximum
order correlation immune. Hence divide and conquer attacks in their standard
form are not useful for determining the initial states of the LFSRs. In section 6
it is shown how the achieved correlation relation can be utilized to determine a
theoretical upper-bound of the level of the security of the Bluetooth combiner.
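The relation behind (7) can be explored with a small simulation. This is a sketch under assumptions: the four input streams are replaced by independent random bits instead of LFSR outputs, the memory starts from an arbitrary all-zero state, and the function names are hypothetical. The code checks that the keystream combination and the input combination differ exactly by the carry combination, and returns an empirical estimate of the correlation without asserting its value.

```python
import random

def f1(x, c):
    """s_{t+1} = floor((x1 + x2 + x3 + x4 + 2*c1 + c0) / 2) as bits (s1, s0)."""
    s = (sum(x) + 2 * c[0] + c[1]) // 2
    return (s >> 1) & 1, s & 1

def run(n_steps, seed=0):
    """Clock the combiner; return per-step input sums, outputs and carries."""
    rng = random.Random(seed)
    c_prev, c_cur = (0, 0), (0, 0)       # arbitrary initial memory (assumption)
    xs, zs, c0s = [], [], []
    for _ in range(n_steps):
        x = tuple(rng.randint(0, 1) for _ in range(4))
        x_sum = x[0] ^ x[1] ^ x[2] ^ x[3]
        xs.append(x_sum)
        c0s.append(c_cur[1])
        zs.append(x_sum ^ c_cur[1])
        s = f1(x, c_cur)
        c_next = (s[0] ^ c_cur[0] ^ c_prev[1],               # equation (3)
                  s[1] ^ c_cur[1] ^ c_prev[0] ^ c_prev[1])   # equation (4)
        c_prev, c_cur = c_cur, c_next
    return xs, zs, c0s

def estimate(n_steps=20000, seed=0):
    """Empirical correlation between z_t+z_{t-1}+z_{t-3} and the input sums."""
    xs, zs, c0s = run(n_steps, seed)
    agree = 0
    for t in range(3, n_steps):
        lhs = zs[t] ^ zs[t - 1] ^ zs[t - 3]
        rhs = xs[t] ^ xs[t - 1] ^ xs[t - 3]
        carry = c0s[t] ^ c0s[t - 1] ^ c0s[t - 3]
        assert lhs ^ rhs == carry        # the identity stated before (7)
        agree += 1 if lhs == rhs else -1
    return agree / (n_steps - 3)

print(estimate())
```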



5 Alternative Mappings for Mixing the Carry Bits 



The goal of this section is to investigate, how the choice of the mapping T affects 
the correlation properties of the Bluetooth combiner. In particular, we show that 
the mapping T can be selected in such a way that more than two linear approxi- 
mations are needed when establishing a correlation relation between consecutive 
carry bits. 

Our method exploits a matrix which makes it possible to consider all possible 
linear approximations of the function /i simultaneously. 

Let $T_0$, $T_1$ and $T_2$ be arbitrary $2 \times 2$ matrices:

Then

$$c_t = T_0 s_t \oplus T_1 c_{t-1} \oplus T_2 c_{t-2}. \qquad (8)$$

We write $T_1 = A \oplus B$. Here the analyst can choose $A$ and $B$ in whichever way
is convenient, as long as their sum is $T_1$. The equation (8) can be written as

$$c_t = T_0 s_t \oplus A c_{t-1} \oplus B c_{t-1} \oplus T_2 c_{t-2}. \qquad (9)$$

We perform one iteration by inserting equation (8), applied for $t - 1$ instead of
$t$, into equation (9) and get

$$c_t = T_0 s_t \oplus B c_{t-1} \oplus A T_0 s_{t-1} \oplus A T_1 c_{t-2} \oplus A T_2 c_{t-3} \oplus T_2 c_{t-2}. \qquad (10)$$



Now, $A$ may be chosen. Let $D$ be a matrix of the form

$$D = \begin{pmatrix} d_1 & d_2 \\ 0 & d_4 \end{pmatrix}.$$

As the analyst wishes to minimize the number of correlation approximations, she
wants $c_t^0$ not to depend on $c_{t-3}^1$. Therefore, she chooses $A T_2 = D$. If we assume
that $T_2$ is invertible then such a choice is always possible. Inserting $B = T_1 \oplus A$
into equation (10), as well as $A = D T_2^{-1}$, we have

$$c_t = T_0 s_t \oplus (T_1 \oplus D T_2^{-1}) c_{t-1} \oplus (D T_2^{-1} T_0) s_{t-1} \oplus (D T_2^{-1} T_1 \oplus T_2) c_{t-2} \oplus D c_{t-3}. \qquad (11)$$
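Equation (11) is only a rewriting of the recursion (8), which can be confirmed mechanically. The following is a sketch with hypothetical helper names: it generates arbitrary sequences satisfying (8) over $GF(2)$ and checks that (11) reproduces every $c_t$ for a chosen $D$ and an invertible $T_2$.

```python
import random

def mat_mul(a, b):
    return [[(a[i][0] & b[0][j]) ^ (a[i][1] & b[1][j]) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] ^ b[i][j] for j in range(2)] for i in range(2)]

def mat_vec(a, v):
    return [(a[i][0] & v[0]) ^ (a[i][1] & v[1]) for i in range(2)]

def vec_add(*vs):
    out = [0, 0]
    for v in vs:
        out = [out[0] ^ v[0], out[1] ^ v[1]]
    return out

def inverse(a):
    """Inverse of an invertible 2x2 matrix over GF(2)."""
    det = (a[0][0] & a[1][1]) ^ (a[0][1] & a[1][0])
    if det != 1:
        raise ValueError("matrix is singular over GF(2)")
    return [[a[1][1], a[0][1]], [a[1][0], a[0][0]]]

def check_eq11(T0, T1, T2, D, steps=50, seed=0):
    rng = random.Random(seed)
    s = [[rng.randint(0, 1), rng.randint(0, 1)] for _ in range(steps)]
    c = [[rng.randint(0, 1), rng.randint(0, 1)] for _ in range(2)]
    for t in range(2, steps):                         # recursion (8)
        c.append(vec_add(mat_vec(T0, s[t]), mat_vec(T1, c[t - 1]),
                         mat_vec(T2, c[t - 2])))
    A = mat_mul(D, inverse(T2))                       # A = D T2^{-1}
    for t in range(3, steps):                         # rewritten form (11)
        rhs = vec_add(
            mat_vec(T0, s[t]),
            mat_vec(mat_add(T1, A), c[t - 1]),
            mat_vec(mat_mul(A, T0), s[t - 1]),
            mat_vec(mat_add(mat_mul(A, T1), T2), c[t - 2]),
            mat_vec(D, c[t - 3]))
        if rhs != c[t]:
            return False
    return True

I2 = [[1, 0], [0, 1]]
T2_BLUETOOTH = [[0, 1], [1, 1]]
print(check_eq11(I2, I2, T2_BLUETOOTH, [[1, 1], [0, 1]]))
```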



In order to take the correlation approximations into consideration, we write them
in matrix form as

$$s_t = \Lambda_t c_{t-1},$$



where 



A* = 



e? ef 



and $s_t$ and $c_{t-1}$ are taken as column vectors.

We see from Table 1, that if = 0 the correlations for 

c(s?, e? • cl_i © et ■ c?_i © u ■ Xt), 



are always zero. Therefore we can presume = 1. The choice of u does not 
affect the best non-zero values of the correlations. Therefore, we shall drop u ■ Xt 
and merely study the combinations of s\^i and c^. 

We approximate twice by inserting $s_t = \Lambda_t c_{t-1}$ into equation (11), which
yields

$$c_t = (T_0 \Lambda_t \oplus T_1 \oplus D T_2^{-1}) c_{t-1} \oplus (D T_2^{-1} T_0 \Lambda_{t-1} \oplus D T_2^{-1} T_1 \oplus T_2) c_{t-2} \oplus D c_{t-3}. \qquad (12)$$

In Bluetooth the generated key sequence $z_t$ does not depend on $c_t^1$ but merely on
$c_t^0$. Hence, similarly as above in section 4.2, we aim at establishing a correlation
relation between the zero components of $c_t$.
Theorem 6. Let in the Bluetooth combiner generator $T_0 = T_1 = I$ and $T_2$ an
arbitrary invertible 2 x 2 matrix. If t^ = 1, then two correlation approximations 
suffices to establish a correlation between the input and output. 

Proof. Substitute Tq = Ti = I and the general form of T 2 into (12) and obtain 
the following correlation relation for c° : 

c° = (1 0 d 4 , 0 1 0 ^ 4 ^ 2 ) • ct_i 

0 {d4e}_i 0 ^4^2 0 ^4 0 1 , d4e^_i 0 d4tlef_i 0 ^ 4^2 0 ^ 2 ) ' Ct -2 

0 d4Cj_3 

To have only two approximations means that there must be neither cl_i nor 
Cj _2 in the equation above, i.e. 

l0d4 = O (13) 

and 

d4ej_4 0 (^4^2 ® 0 1 = 0. (14) 

If d 4 = 0, then the other approximation will cancel out, so that equation (11) 
transforms into the initial equation 

~ ^ ^t-1 ^ ^t-2 ^ ^t-2' 

As this is not what the analyst wants, she chooses ^4 = 1 and equation (13) is 
true. From (14) we then get that 

el_i ®tl=0. 

Now we check from Table 1 that c(s^_i,c °_2 0 u ■ Xt-f) 0 for some u. Hence 
it is possible to use this correlation if ^2 = 0- Similarly, if ^2 = 1) we see that 
el_i = 1 is possible, as c(Sf_i,cl_2 0 • c °_2 0 u ■ Xt-2) 7 ^ 0 for some choice of 

u and v^. □ 

In the case of the initial choice of Bluetooth T 2 , we have = 1- So, as we 
saw earlier in section 4.2, only two iteration approximations are needed. The 
approximation matrices Xt in the case of ( 6 ) were 




From Table 1 we see that c{sl_i, cl_ 2 ) = |. Hence 




with ^2 = 1 would have been still a weaker choice than the current T 2 in Blue- 
tooth. Next we show that a stronger choice would have been possible. 

Theorem 7. Let t^ = 0- Then at least three approximation rounds are needed. 
Proof. If ^2 = 0: and T 2 is invertible as assumed, then = 1- Also T 2 ^ =

T 2 . Let 



D = 



(di d2\ 
yds d4 J 



Then, as in (12) we have 



ct = (At © 7 © DT2)ct-i © {DT2Xt-i © LTsAt-i © DT 2 © T2)ct-2, 



and further, 

C° = (1 © da, 6 t © 1 © ^2*^3 © ^ 4 ) • Ct_l 
© [dset-i ® e?_i(tid3 ® ^ 4 ) ® da, 

daCt.i ® et_i(tid3 © d4) © tfda © d4 © 1] • ct_2 

© (da, d4) • ct_3 

Now, if da = 1, then we have Ct_a in the equation, so we need to do at least one 
more approximation, hence two approximations is not enough. If da = 0, then 
c\_i is within the equation of and again more than two approximations are 
needed. □ 



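An example of a matrix $T_2$, which requires at least three approximations to
get correlation relations between the carry bits $c_t^0$ from different time instances,
is $T_2 = I$. We consider, for example, the correlation between $c_t^0$ and $c_{t-4}^0$. The
correlation relation could involve some $x_t$ variables at appropriate moments $t$,
but in what follows we restrict to the case where the Hamming weight of $u$ is
always zero.

Corresponding to the equations (3) and (4) we now have

$$c_t^1 = s_t^1 \oplus c_{t-1}^1 \oplus c_{t-2}^1$$

$$c_t^0 = s_t^0 \oplus c_{t-1}^0 \oplus c_{t-2}^0.$$

Since no $x_{t-i}$ is involved, we have by Theorem 3 and with the aid of Table 1

$$\begin{aligned}
c(c_t^0, c_{t-4}^0) &= c(s_t^0, c_{t-1}^0 \oplus c_{t-2}^0 \oplus c_{t-4}^0) \\
&= \sum_{w} c(s_t^0, w \cdot c_{t-1})\, c(w \cdot c_{t-1}, c_{t-1}^0 \oplus c_{t-2}^0 \oplus c_{t-4}^0) \\
&= c(s_t^0, c_{t-1}^1)\, c(c_{t-1}^1, c_{t-1}^0 \oplus c_{t-2}^0 \oplus c_{t-4}^0).
\end{aligned}$$

The second equality follows from Theorem 3, and in the third we noted that the
only non-zero correlation for $c(s_t^0, w \cdot c_{t-1})$ is due to $w = (1, 0)$. We continue in
the same manner to obtain:

$$\begin{aligned}
c(c_t^0, c_{t-4}^0) &= -\frac{1}{4}\, c(s_{t-1}^1 \oplus s_{t-1}^0, c_{t-2}^1 \oplus c_{t-3}^1 \oplus c_{t-3}^0 \oplus c_{t-4}^0) \\
&= -\frac{1}{4} \sum_{w} c(s_{t-1}^1 \oplus s_{t-1}^0, w \cdot c_{t-2})\, c(w \cdot c_{t-2}, c_{t-2}^1 \oplus c_{t-3}^1 \oplus c_{t-3}^0 \oplus c_{t-4}^0) \\
&= -\frac{1}{4} \Big( c(s_{t-1}^1 \oplus s_{t-1}^0, 0)\, c(c_{t-2}^1 \oplus c_{t-3}^1 \oplus c_{t-3}^0 \oplus c_{t-4}^0, 0) \\
&\qquad\quad + c(s_{t-1}^1 \oplus s_{t-1}^0, c_{t-2}^1 \oplus c_{t-2}^0)\, c(c_{t-2}^0, c_{t-3}^1 \oplus c_{t-3}^0 \oplus c_{t-4}^0) \Big). \qquad (15)
\end{aligned}$$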
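From the two terms inside the brackets only the last one is non-zero. To see this
we calculate the first term

$$\begin{aligned}
c(c_{t-2}^1 \oplus c_{t-3}^1 \oplus c_{t-3}^0 \oplus c_{t-4}^0, 0) &= c(s_{t-2}^1, c_{t-3}^0 \oplus c_{t-4}^1 \oplus c_{t-4}^0) \\
&= \sum_{w} c(s_{t-2}^1, w \cdot c_{t-3})\, c(w \cdot c_{t-3}, c_{t-3}^0 \oplus c_{t-4}^1 \oplus c_{t-4}^0) \\
&= c(s_{t-2}^1, c_{t-3}^0)\, c(c_{t-4}^1 \oplus c_{t-4}^0, 0) + c(s_{t-2}^1, c_{t-3}^1)\, c(c_{t-3}^1, c_{t-3}^0 \oplus c_{t-4}^1 \oplus c_{t-4}^0).
\end{aligned}$$

In the last equality the first term in the sum is zero, since the memory bits at
the same moment are assumed to be statistically independent. We continue with
the second part of the sum:

$$\begin{aligned}
c(c_{t-3}^1 \oplus c_{t-3}^0, c_{t-4}^1 \oplus c_{t-4}^0) &= c(s_{t-3}^1 \oplus s_{t-3}^0, c_{t-5}^1 \oplus c_{t-5}^0) \\
&= \sum_{w} c(s_{t-3}^1 \oplus s_{t-3}^0, w \cdot c_{t-4})\, c(w \cdot c_{t-4}, c_{t-5}^1 \oplus c_{t-5}^0) \\
&= c(s_{t-3}^1 \oplus s_{t-3}^0, 0)\, c(c_{t-5}^1 \oplus c_{t-5}^0, 0) \\
&\quad + c(s_{t-3}^1 \oplus s_{t-3}^0, c_{t-4}^1 \oplus c_{t-4}^0)\, c(c_{t-4}^1 \oplus c_{t-4}^0, c_{t-5}^1 \oplus c_{t-5}^0) \\
&= \frac{1}{4}\, c(c_{t-4}^1 \oplus c_{t-4}^0, c_{t-5}^1 \oplus c_{t-5}^0).
\end{aligned}$$

These calculations are easy to generalize to any moment $t - j$, $j = 0, 1, 2, \ldots$, so
actually we have

$$c(c_{t-3}^1 \oplus c_{t-3}^0, c_{t-4}^1 \oplus c_{t-4}^0) = \left(\frac{1}{4}\right)^{k} c(c_{t-3-k}^1 \oplus c_{t-3-k}^0, c_{t-4-k}^1 \oplus c_{t-4-k}^0).$$

Hence, infinitely many approximations should be done, and so the correlation
can be regarded as zero:

$$c(c_{t-3}^1 \oplus c_{t-3}^0, c_{t-4}^1 \oplus c_{t-4}^0) = \lim_{k \to \infty} \left(\frac{1}{4}\right)^{k} = 0.$$

Let us now return to equation (15). We have that

$$\begin{aligned}
c(c_t^0, c_{t-4}^0) &= -\frac{1}{4} \cdot \frac{1}{4}\, c(c_{t-2}^0, c_{t-3}^1 \oplus c_{t-3}^0 \oplus c_{t-4}^0) \\
&= -\frac{1}{16}\, c(s_{t-2}^0, c_{t-3}^1) = \frac{1}{64},
\end{aligned}$$

which is significantly smaller in absolute value than the $\frac{1}{16}$ of equation (6). In this manner, the
correlation can be calculated for other weights of $u$, too. The resulting values degenerate
to a single product, as above, so that the product of the correlations is always
significantly smaller than $\frac{1}{16}$.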
The Bluetooth combination generator is a strengthened version of the basic
summation bit generator. By increasing the size of the memory the correlations 
have been reduced. We have shown that with the same memory size, by making 
a small modification in the memory update function, it would be possible to 
further reduce the correlations. 

6 Ultimate Divide and Conquer 

In this section it is shown that a divide and conquer attack becomes possible if
the length of the given keystream is longer than the period $p$ of the shortest
(say, the first) LFSR used in the key stream generation. Assume that there is a
relation with a non-zero correlation $\rho$ between a linear combination of the shift
register output bits

$$(u_0^1 x_t^1 \oplus \ldots \oplus u_0^n x_t^n) \oplus \ldots \oplus (u_d^1 x_{t-d}^1 \oplus \ldots \oplus u_d^n x_{t-d}^n)$$

and the key stream bits

$$w_0 z_t \oplus w_1 z_{t-1} \oplus \ldots \oplus w_d z_{t-d},$$

over a number of $d + 1$ time steps. Then it follows by Corollary 5 that we have
a correlation relation between a linear combination of the keystream bits

$$w_0 (z_t \oplus z_{t+p}) \oplus w_1 (z_{t-1} \oplus z_{t+p-1}) \oplus \ldots \oplus w_d (z_{t-d} \oplus z_{t+p-d})$$

and a linear combination of the LFSR output bits

$$u_0^2 (x_t^2 \oplus x_{t+p}^2) \oplus \ldots \oplus u_0^n (x_t^n \oplus x_{t+p}^n) \oplus \ldots \oplus u_d^2 (x_{t-d}^2 \oplus x_{t+p-d}^2) \oplus \ldots \oplus u_d^n (x_{t-d}^n \oplus x_{t+p-d}^n),$$

where the output bits from the first (the shortest) shift register cancel, since
they are equal.

By Corollary 5, the strength of the correlation over the period $p$ is at least
$\rho^2$. Further, Corollary 5 shows how this lower bound can be improved. We state
this result in the form of a theorem as follows.

Theorem 8. For a combination generator, assume that we have the following
correlation

$$c\big(w_0 z_t \oplus w_1 z_{t-1} \oplus \ldots \oplus w_d z_{t-d},\ (u_0^1 x_t^1 \oplus \ldots \oplus u_0^n x_t^n) \oplus \ldots \oplus (u_d^1 x_{t-d}^1 \oplus \ldots \oplus u_d^n x_{t-d}^n)\big) = \rho \ne 0.$$

Let the lengths of the registers be $L_1, \ldots, L_n$ and the periods $p_1, \ldots, p_n$. Then
given a keystream of length piP2 ■ ■ ■ Pk + -^ + d one can do exhaustive search over 
the $L_{k+1} + \ldots + L_n$ bits which form the initial contents of $n - k$ registers.

If the LFSR registers have primitive feedback polynomials, then $p_i = 2^{L_i} - 1$.
In most applications $n$ is even and the lengths $L_i$ are about the same. Then given
a sufficiently strong correlation between the input bits and the output bits of a
combination generator, the complexity to determine the complete initial state of
length $L$ is about $O(2^{L/2})$. In other words, by generating key stream of length
$O(2^{L/2})$ one can successfully carry out exhaustive search over $L/2$ bits of the
initial state.
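The trade-off in Theorem 8 is easy to tabulate. The sketch below is illustrative only: it assumes the register lengths 25, 31, 33 and 39 of the Bluetooth specification (the text above only states that they total 128), takes the product of the $k$ shortest periods as the keystream requirement, and ignores the additional terms in the theorem that depend on the correlation and on $d$.

```python
from math import log2

def attack_tradeoff(lengths, k):
    """Keystream needed (product of the k shortest periods) and bits left to search."""
    ordered = sorted(lengths)
    keystream = 1
    for length in ordered[:k]:
        keystream *= 2 ** length - 1      # period of a primitive LFSR
    search_bits = sum(ordered[k:])
    return keystream, search_bits

LENGTHS = (25, 31, 33, 39)                # assumed E0 register lengths, total 128
for k in range(1, 4):
    keystream, search_bits = attack_tradeoff(LENGTHS, k)
    print(k, round(log2(keystream), 1), search_bits)
```

With $k = 2$ the keystream requirement is roughly $2^{56}$ bits and 72 state bits remain to be searched, which is in line with the $O(2^{L/2})$ estimate for $L = 128$.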







Miia Hermelin and Kaisa Nyberg 



6.1 Periodic Correlations in Bluetooth 

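Computation of the correlations for the Bluetooth combiner is somewhat
complicated due to multiple iteration. We make use of the relation $c_t^0 \oplus c_{t-1}^0 \oplus
c_{t-3}^0 = 0$, see (6). Applying Theorem 3 we get

$$\begin{aligned}
&c(c_t^0 \oplus c_{t-1}^0 \oplus c_{t-3}^0 \oplus c_{t+p}^0 \oplus c_{t+p-1}^0 \oplus c_{t+p-3}^0, 0) \\
&\quad = c(s_t^0 \oplus s_{t+p}^0, c_{t-2}^1 \oplus c_{t-2}^0 \oplus c_{t-3}^0 \oplus c_{t+p-2}^1 \oplus c_{t+p-2}^0 \oplus c_{t+p-3}^0) \\
&\quad = \sum_{w, w' \in GF(2^2)} c(s_t^0 \oplus s_{t+p}^0, w \cdot c_{t-1} \oplus w' \cdot c_{t+p-1}) \\
&\qquad\qquad \cdot\, c(w \cdot c_{t-1} \oplus w' \cdot c_{t+p-1}, c_{t-2}^1 \oplus c_{t-2}^0 \oplus c_{t-3}^0 \oplus c_{t+p-2}^1 \oplus c_{t+p-2}^0 \oplus c_{t+p-3}^0).
\end{aligned}$$

Now we apply Theorem 4 to the first correlation in the product and get

$$c(s_t^0 \oplus s_{t+p}^0, w \cdot c_{t-1} \oplus w' \cdot c_{t+p-1}) = \sum_{u \in GF(2^2)} c(s_t^0, w \cdot c_{t-1} \oplus u \cdot x)\, c(s_t^0, w' \cdot c_{t-1} \oplus u \cdot x).$$

Here $x$ has one, two, or three coordinates, depending on whether $p$ is the least
common period of one, two, or three LFSRs, respectively.

Let us now consider the case where $p$ is the least common period of two
LFSRs. From Table 1 we see that these correlations are nonzero if and only if
$u = (0, 0)$ and $w = w' = (1, 0)$, or $u = (1, 1)$ and $w = w' = (1, 0)$, or $u = (0, 1)$
and $w = w' = (1, 1)$, or finally, $u = (1, 0)$ and $w = w' = (1, 1)$.

The value $w = w' = (1, 1)$ leads to a longer correlation relation extending
over at least two rounds, and hence the corresponding terms are expected to be smaller, but still
non-negative. Therefore, we discard the corresponding terms, and get a lower
bound to the correlation from the remaining terms with $w = w' = (1, 0)$ as
follows

$$\begin{aligned}
&c(c_t^0 \oplus c_{t-1}^0 \oplus c_{t-3}^0 \oplus c_{t+p}^0 \oplus c_{t+p-1}^0 \oplus c_{t+p-3}^0, 0) \\
&\quad \ge \big(c(s_t^0, c_{t-1}^1)^2 + c(s_t^0, c_{t-1}^1 \oplus x_t^1 \oplus x_t^2)^2\big) \\
&\qquad \cdot\, c(c_{t-1}^1 \oplus c_{t+p-1}^1, c_{t-2}^1 \oplus c_{t-2}^0 \oplus c_{t-3}^0 \oplus c_{t+p-2}^1 \oplus c_{t+p-2}^0 \oplus c_{t+p-3}^0) \\
&\quad = \big(c(s_t^0, c_{t-1}^1)^2 + c(s_t^0, c_{t-1}^1 \oplus x_t^1 \oplus x_t^2)^2\big) \cdot \sum_{u \in GF(2^2)} c(s_{t-1}^1, c_{t-2}^0 \oplus u \cdot x)^2 \\
&\quad = \big((-1/4)^2 + (1/4)^2\big)(1/4)^2 = 2^{-7}
\end{aligned}$$

using the correlation values given in Table 1.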
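The arithmetic of the final line can be reproduced with exact fractions. This short check only restates the three correlation values quoted in the derivation; the variable names are chosen for illustration.

```python
from fractions import Fraction

c_s0_c1 = Fraction(-1, 4)      # c(s_t^0, c_{t-1}^1)
c_s0_c1_x = Fraction(1, 4)     # c(s_t^0, c_{t-1}^1 xor x_t^1 xor x_t^2)
c_s1_c0 = Fraction(1, 4)       # c(s_{t-1}^1, c_{t-2}^0)

# Lower bound on the correlation over one period of two LFSRs.
bound = (c_s0_c1 ** 2 + c_s0_c1_x ** 2) * c_s1_c0 ** 2
print(bound)
```

The bound $2^{-7}$ is larger than the square $(1/16)^2 = 2^{-8}$ of the single-period correlation (6), which is the improvement over the $\rho^2$ bound that Corollary 5 makes possible.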

It should be stressed, however, that the presented ultimate divide and conquer
attack is of theoretical nature, and practical only if the analyzer is given
access to key stream extending over periods of partial input. For example, the
Bluetooth $E_0$ algorithm in its intended use generates only short segments of
keystream to encrypt each plaintext frame starting from a new independent
initial state.
7 Conclusions

We have seen how the correlations in the Bluetooth combiner could be reduced 
by making a small modification in its memory update function. This improve- 
ment is, however, rather theoretical in nature, but quite interesting as such. The 
methods used in finding this modification are specific to Bluetooth, but could 
be easily adapted to other similar combiner generators. The technique involves 
a matrix describing potential approximations based on known non-zero linear 
correlations over the non-linear part of the memory update function. 

We also showed how any significant correlations over a combiner can be used 
to launch a divide and conquer attack against any combiner generator provided 
that sufficient amount of the output keystream is given. If the input to the 
combiner is produced using a certain number of LFSRs with primitive feedback
polynomials, and the number of bits of the total initial state is $L$, then the
complexity of this attack is upper bounded by $O(2^{L/2})$. This will require an
amount of the same magnitude $O(2^{L/2})$ of the output bits.

We conclude that if the effective key length of a combiner generator is re- 
quired to be about the same magnitude as the size of the initial state, then 
the usage of the generator must be restricted in such a way that the length of 
any keystream block ever produced by this generator never exceeds the shortest 
period of the input sequences. 
Abstract. Most digital cash systems have the identity revelation capability under
the condition of double-spending, but the capability may be misused as a framing
tool by Bank. We present a method that provides both the identity revelation
capability and the framing prevention property.

Keywords : Cryptography, Digital Cash, Zero-Knowledge Interactive Proof, 
1 Motivation

Digital cash is the most versatile tool for electronic commerce, and it may be an
alternative to paper money. Chaum introduced quite a new concept of digital money
in his paper [3]. Besides its convenience of use, Chaum’s digital cash provides
its user anonymity. In 1993, Brands designed an efficient off-line cash system, which
includes not only all the merits of Chaum’s system, but also other good functionalities
such as the framing prevention [1].

One of the most intriguing properties of Brands’ scheme is that User is computationally
protected against Bank’s framing: Bank cannot compute a proof of double-spending
if User follows the protocols and does not double-spend. Anonymity control
and concern about Bank’s framing attempts have led to research such as [5], [6], [7] and
[10]. But they did not treat the following problem: what if User spends a coin more
than twice? Does the fact guarantee that User is guilty of $n(> 2)$-spending? When
Bank insists that $n(> 2)$-spending occurred in a digital cash system, there are actually
two possibilities. One is Bank’s framing of User. Since Bank already knew User’s secret
when double-spending occurred, it is possible to make another illegal copy of the
double-spent coin, if needed, with the collusion of Shop. The other possibility is that
User actually spent one coin more than twice. Besides the $n(> 2)$-spending problem, the revealed $u_1$
may be misused by Bank in various ways (e.g. Bank can withdraw digital coins
using $u_1$).

This unavoidable problem comes from the lack of a user secret that would not
be revealed even when she double-spends a coin. If she has a key that would maintain
its secrecy after the double-spending revelation and the key is required to perform the
payment protocol, User cannot assert Bank’s framing while she spent a coin more than
twice. Owing to the existence of the secret that only User knows, Bank can make Judge
sure that $n(> 2)$-spending has really been committed by User, not counterfeited by Bank itself.
However, when double-spending occurs, the maintenance of secrecy or framing prevention
looks incompatible with the identity revelation. In this paper, we solve the problem
by introducing another secret into Brands’ scheme and using the normal 3-move
zero-knowledge interactive proof (ZKIP).

2 Features on Brands’ Scheme 

First, we briefly describe Brands’ scheme. Refer to [1] for further understanding.

- The setup of the system: Bank generates at random a generator-tuple $(g, g_1, g_2)$
and a number $x \in_R \mathbb{Z}_q^*$. Also, she chooses two collision-intractable hash functions
$H$, $H_0$. $H$ is used for the signature generation/verification and $H_0$ is for the computation
of challenges. The generator-tuple and two hash functions are her public-key,
and $x$ is her private-key.

- Opening an account: User’s identity is $I = g_1^{u_1} g_2$ (User is computationally protected
from Bank’s framing) or $I = g_1^{u_1} g_2^{u_2}$ (the framing attempts of Bank have
negligible probability of success, regardless of computing power). Because the case
of $I = g_1^{u_1} g_2^{u_2}$ is easily adapted from the case of $I = g_1^{u_1} g_2$, we proceed for the
latter case.

User generates randomly $u_1$ and computes $I = g_1^{u_1}$. If $I g_2 \ne 1$, User
transmits $I$ to Bank, and keeps $u_1$ secret. Bank keeps $I$ with User’s identifying
information in her database. Bank computes $z = (I g_2)^x$, and transmits it to User.

- The withdrawal protocol: After proving ownership of her account, User performs
the following withdrawal protocol.

1. Bank generates $w \in_R \mathbb{Z}_q$, and sends $a = g^w$ and $b = (I g_2)^w$ to User.

2. User generates $s \in_R \mathbb{Z}_q^*$, $x_1, x_2 \in_R \mathbb{Z}_q$, and computes $A = (I g_2)^s$, $B =
g_1^{x_1} g_2^{x_2}$ and $z' = z^s$. User generates $u, v \in_R \mathbb{Z}_q$ and computes $a' = a^u g^v$, $b' =
b^{su} A^v$. User then computes the challenge $c' = H(A, B, z', a', b')$, and sends
the blinded challenge $c = c'/u \bmod q$ to Bank.

3. Bank sends the response $r = cx + w \bmod q$ to User, and debits the account of
User.

4. User accepts iff $g^r = h^c a$ and $(I g_2)^r = z^c b$ (with $h = g^x$, as in [1]). If this holds, User computes
$r' = ru + v \bmod q$.

Now User has a coin that looks like $(A, B, Sign(A, B) = (z', a', b', r'))$.

- The payment protocol:

1. User sends $(A, B, Sign(A, B))$ to Shop.

2. If $A \ne 1$, then Shop computes and sends the challenge $d = H_0(A, B, I_S, \text{date/time})$.

3. User computes and sends the responses $r_1 = d(u_1 s) + x_1 \bmod q$ and $r_2 =
ds + x_2 \bmod q$.

4. Shop accepts iff $Sign(A, B)$ is valid, and $g_1^{r_1} g_2^{r_2} = A^d B$.

- The deposit protocol: Shop performs the same procedure as that in the payment
protocol with Bank. Bank accepts iff $Sign(A, B)$ is valid, and $g_1^{r_1} g_2^{r_2} = A^d B$ and
$A$ has not been stored before. If $A$ is already in the deposit database, Bank can find
the double-spender’s identity by computing $g_1^{(r_1 - r_1')/(r_2 - r_2')}$.
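The protocol steps above can be traced end to end in a toy group. The following is a sketch under stated assumptions, not an implementation of [1]: the parameters ($p = 2039$, $q = 1019$, generators 4, 9, 25) are far too small for real use, the hash is an ad hoc SHA-256 construction, the "random" values are passed in explicitly, and the shop challenge $d$ is supplied by the caller instead of being computed with $H_0$. All function names are hypothetical.

```python
import hashlib

P, Q = 2039, 1019            # toy safe-prime pair, P = 2Q + 1
G, G1, G2 = 4, 9, 25         # squares mod P, so they lie in the order-Q subgroup

def H(*parts):
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def withdraw(x, u1, w, s, x1, x2, u, v):
    """Withdrawal protocol with explicit 'random' values; returns the coin."""
    h = pow(G, x, P)
    identity = pow(G1, u1, P)
    m = identity * G2 % P
    z = pow(m, x, P)
    a, b = pow(G, w, P), pow(m, w, P)                      # Bank, step 1
    A = pow(m, s, P)                                       # User, step 2
    B = pow(G1, x1, P) * pow(G2, x2, P) % P
    z1 = pow(z, s, P)
    a1 = pow(a, u, P) * pow(G, v, P) % P
    b1 = pow(b, s * u, P) * pow(A, v, P) % P
    c1 = H(A, B, z1, a1, b1)
    c = c1 * pow(u, -1, Q) % Q                             # blinded challenge
    r = (c * x + w) % Q                                    # Bank, step 3
    assert pow(G, r, P) == pow(h, c, P) * a % P            # User, step 4
    assert pow(m, r, P) == pow(z, c, P) * b % P
    r1 = (r * u + v) % Q
    return {"A": A, "B": B, "sig": (z1, a1, b1, r1), "h": h, "I": identity}

def sig_valid(coin):
    A, B = coin["A"], coin["B"]
    z1, a1, b1, r1 = coin["sig"]
    c1 = H(A, B, z1, a1, b1)
    return (pow(G, r1, P) == pow(coin["h"], c1, P) * a1 % P and
            pow(A, r1, P) == pow(z1, c1, P) * b1 % P)

def pay(d, u1, s, x1, x2):
    """User's responses (r1, r2) to the shop challenge d."""
    return (d * u1 * s + x1) % Q, (d * s + x2) % Q

def shop_accepts(coin, d, r1, r2):
    lhs = pow(G1, r1, P) * pow(G2, r2, P) % P
    rhs = pow(coin["A"], d, P) * coin["B"] % P
    return coin["A"] != 1 and sig_valid(coin) and lhs == rhs

def reveal_identity(resp, resp2):
    """Bank's computation g1^((r1 - r1')/(r2 - r2')) after a double deposit."""
    (r1, r2), (q1, q2) = resp, resp2
    exponent = (r1 - q1) * pow(r2 - q2, -1, Q) % Q
    return pow(G1, exponent, P)

coin = withdraw(x=123, u1=7, w=45, s=3, x1=11, x2=13, u=17, v=19)
first = pay(5, 7, 3, 11, 13)
second = pay(11, 7, 3, 11, 13)       # same coin spent again, new challenge
print(shop_accepts(coin, 5, *first), reveal_identity(first, second) == coin["I"])
```

A single payment leaves two equations in the unknowns $s$, $x_1$, $x_2$, $u_1$, while two payments of the same coin let Bank eliminate $x_1$ and $x_2$ and solve for $u_1$, which is exactly what `reveal_identity` does.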
The role of $u_1$ in Brands’ scheme is both to identify each user’s coins when double-spending
occurs and to prevent Bank from framing User on double-spending. Thus, the
revelation of $u_1$ means not only double-spending detection, but also the possibility of
Bank’s framing. After Bank’s double-spending detection, Bank comes to know $u_1$ and
$s$. With $u_1$ and $s$, Bank can make another forged transcript of dealing with the double-spent
coin. In this way, Bank can frame $n(> 2)$-spending on User for an arbitrary
$n$. Besides that, the revealed $u_1$ may be misused by Bank in various ways (e.g. Bank can
withdraw digital coins using $u_1$). If needed, Bank may ask Shop to accept the forged
transcript of dealing as true in front of Judge. Confronted with this case, User has no
way to insist that she has spent the coin not $n(> 2)$ times, but only twice.



3 Protocol Design Concept 

As mentioned before, $u_1$, the one secret that User has, plays two roles in Brands’
scheme. With only one secret, we cannot obtain both the identity revelation and the
framing prevention when double-spending occurs. As a natural consequence, we separate
$u_1$ into $u_1$ and $u_2$ such that each secret plays only one role. That is, $u_1$ reveals
User’s identity and $u_2$ prevents Bank from framing User on $n(> 2)$-spending when
double-spending occurs. $u_2$ would not be revealed even when User double-spends a
coin. Separation of $u_1$ into $u_1$ and $u_2$ according to their roles enables us to achieve both
functionalities.

Note that Brands’ payment protocol does not include the commitment move by the
prover, whereas 3-move ZKIP systems such as [4,8] include the commitment stage. From
that point of view, Brands’ scheme separates the commitment stage from the payment
protocol and includes it in the withdrawal protocol. As is known, committed values must
be different whenever the prover proves its knowledge in the 3-move ZKIP. If the same
committed value is used again, the verifier can easily compute the prover’s secret. The
usage of the same committed value during the proof of the same knowledge in ZKIP
corresponds exactly to the double-spending of the same coin in Brands’ setting. And
the exposure of the secret in ZKIP corresponds to the exposure of User’s identity in
Brands’ scheme.
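The effect of reusing a committed value can be seen in a few lines. The sketch below uses a Schnorr-style identification protocol as a stand-in for the 3-move ZKIP systems cited as [4,8]; that choice, the toy group parameters and the function names are assumptions made for illustration.

```python
P, Q, G = 2039, 1019, 4       # toy group: G has prime order Q modulo P

def respond(secret, nonce, challenge):
    """Prover's third move: r = nonce + challenge * secret mod Q."""
    return (nonce + challenge * secret) % Q

def verify(public, commitment, challenge, response):
    """Verifier's check: G^r == commitment * public^challenge mod P."""
    return pow(G, response, P) == commitment * pow(public, challenge, P) % P

def extract(ch1, r1, ch2, r2):
    """Recover the secret from two responses that reuse one commitment."""
    return (r1 - r2) * pow(ch1 - ch2, -1, Q) % Q

secret, nonce = 321, 77
public, commitment = pow(G, secret, P), pow(G, nonce, P)
r1 = respond(secret, nonce, 5)
r2 = respond(secret, nonce, 9)      # same commitment, different challenge
print(verify(public, commitment, 5, r1), extract(5, r1, 9, r2))
```

Both transcripts verify, yet together they determine the secret, just as two payments with the same coin determine $u_1$ in Brands’ scheme.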

The knowledge of the first secret ($u_1$) is proved in the same way as in Brands’ scheme,
whereas the proof of knowledge of the second secret ($u_2$) is performed by the normal
3-move ZKIP. This setting ensures that $u_1$ is revealed but $u_2$ is kept secret
whenever double-spending occurs. Thus, the payment protocol now looks like the normal
ZKIP for $u_2$.

Our approach has another merit: User does not need to perform the account
opening procedure again. Unlike in Brands’ scheme, we can keep using the revealed $u_1$
together with $u_2$. Suppose that $u_1$ is known to Bank before the withdrawal protocol
starts, owing to a previous double-spending. After the withdrawal ends, Bank cannot
compute $A = (I g_2)^s$ without knowing $s$, which is randomly and secretly selected by
User. That is, Bank’s knowledge of $u_1$ does not help Bank to know what the blinded
message $A$ of $m = I g_2$ is. Consequently, Bank cannot link User with a coin that is
returned after the deposit protocol. Even if Bank tries an exhaustive search over every
user’s identity with coins after the deposit protocol, she cannot link User’s identity
with a once-spent coin $(r_1, r_2, A, B, \mathrm{sign}(A, B), I_S, \text{date/time})$. This can be
seen as follows:

$$r_1 = d(u_1 s) + x_1 \bmod q$$
$$r_2 = d s + x_2 \bmod q$$

In the above simultaneous equations, there are four unknown variables $s, x_1, x_2, u_1$.
Thus, even when Bank correctly guesses $u_1$, the system remains underdetermined: many
solutions are consistent with the equations, and she has no way of validating her guess.
However, Bank may be able to frame User by making a forged transcript with the known
$u_1$ unless User has the unknown $u_2$.
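A small numeric sketch shows why the two response equations alone cannot confirm a guess of $u_1$. The modulus and all concrete values are toy assumptions; the sketch covers only the linear equations and ignores $A$ and $B$, whose relation to the unknowns is hidden behind discrete logarithms.

```python
q = 1019                      # toy group order (illustrative assumption)
d = 123                       # public challenge of the payment

# One honest transcript, generated from values only User knows.
u1, s, x1, x2 = 321, 45, 67, 89
r1 = (d * u1 * s + x1) % q
r2 = (d * s + x2) % q

def explain(u1_guess, s_guess):
    """Return (x1, x2) that make the guesses consistent with (r1, r2)."""
    return (r1 - d * u1_guess * s_guess) % q, (r2 - d * s_guess) % q

def consistent(u1_guess, s_guess, x1_guess, x2_guess):
    """Check both response equations for a candidate solution."""
    return ((d * u1_guess * s_guess + x1_guess) % q == r1
            and (d * s_guess + x2_guess) % q == r2)

# Every guess for u1 (right or wrong) can be completed to a solution.
all_guesses_fit = all(
    consistent(g_u1, g_s, *explain(g_u1, g_s))
    for g_u1 in (u1, 1, 500) for g_s in (s, 7, 900)
)
```

Since a wrong guess fits the equations just as well as the right one, the responses of a once-spent coin give Bank nothing to test a guess against.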

4 Privacy Enhanced Digital Cash System

Our digital cash system consists of system setup, opening an account, the withdrawal
protocol, the payment protocol and the deposit protocol. All the conventions are the
same as in Brands’ scheme. As mentioned already, we augment Brands’ scheme such
that it has two secrets for User, and the added secret is treated as the secret of the
normal 3-move ZKIP technique.

- System setup: This part is the same as in Brands’ scheme, but the domains of the two
hash functions $H$ and $H_0$ are changed to accommodate the added
functionality.

$$H : G_q \times G_q \times G_q \times G_q \times G_q \times G_q \times G_q \times G_q \to Z_q^*$$

$$H_0 : G_q \times G_q \times G_q \times G_q \times \text{SHOP-ID} \times \text{DATE/TIME} \to Z_q$$

- Opening an account: Besides $u_1$, User has another secret $u_2$ and the corresponding
account information $I_2$. Also, Bank must return the account information restricted by
$x$, the Bank’s secret. The only difference from Brands’ scheme is the added secret $u_2$.

$$I_1 = g_1^{u_1}, \quad I_2 = g_1^{u_2}$$
$$z_1 = (I_1 g_2)^x, \quad z_2 = (I_2 g_2)^x$$

- The withdrawal protocol: User blinds the account information by $s_1$ and $s_2$, which are
related to $u_1$ and $u_2$, respectively. However, the number that will be used as a
commitment for $u_2$ in the payment protocol is not prepared in this phase. Instead,
the commitment is generated during each payment. After the completion of
the withdrawal protocol, User has a coin $(A, B, C, \mathrm{Sign}(A, B, C))$.

1. Bank generates $w \in_R Z_q$, and sends $a = g^w$, $b_1 = (I_1 g_2)^w$ and
$b_2 = (I_2 g_2)^w$ to User.

2. User generates $s_1, s_2 \in_R Z_q^*$, $x_1, x_2 \in_R Z_q$, and computes

$$A = (I_1 g_2)^{s_1}, \quad B = g_1^{x_1} g_2^{x_2}, \quad C = (I_2 g_2)^{s_2} \quad \text{and} \quad z_1' = z_1^{s_1}, \quad z_2' = z_2^{s_2}.$$

User generates $u, v \in_R Z_q$ and computes

$$a' = a^u g^v, \quad b_1' = b_1^{s_1 u} A^v, \quad b_2' = b_2^{s_2 u} C^v.$$

User then computes the challenge

$$c' = H(A, B, C, z_1', z_2', a', b_1', b_2'),$$

and sends the blinded challenge $c = c'/u \bmod q$ to Bank.

3. Bank sends the response $r = cx + w \bmod q$ to User, and debits the account of
User.

4. User accepts iff

$$g^r = h^c a, \quad (I_1 g_2)^r = z_1^c b_1 \quad \text{and} \quad (I_2 g_2)^r = z_2^c b_2.$$

If this holds, User computes

$$r' = ru + v \bmod q.$$
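The four steps of the withdrawal protocol can be traced in a minimal sketch over a toy group. Everything concrete here is an assumption for illustration: the parameters `p`, `q`, `g`, `g1`, `g2`, the SHA-256 based stand-in for $H$, the fixed random seed, and the public check in `verify_coin`, which is carried over from Brands' signature verification rather than stated in this section.

```python
import hashlib
import random

p, q = 2039, 1019            # toy group: p = 2q + 1, both prime (assumption)
g, g1, g2 = 4, 9, 25         # squares mod p, hence elements of order q

def H(*elems):
    """Stand-in hash into Z_q^* (assumption: SHA-256 of the joined values)."""
    data = ",".join(str(e) for e in elems).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (q - 1) + 1

rng = random.Random(1999)
x = rng.randrange(1, q)                       # Bank's secret key
h = pow(g, x, p)                              # Bank's public key
u1, u2 = rng.randrange(1, q), rng.randrange(1, q)
I1, I2 = pow(g1, u1, p), pow(g1, u2, p)       # account information
m1, m2 = I1 * g2 % p, I2 * g2 % p
z1, z2 = pow(m1, x, p), pow(m2, x, p)

# Step 1 (Bank)
w = rng.randrange(q)
a, b1, b2 = pow(g, w, p), pow(m1, w, p), pow(m2, w, p)

# Step 2 (User): blind the account information and the challenge
s1, s2 = rng.randrange(1, q), rng.randrange(1, q)
x1, x2 = rng.randrange(q), rng.randrange(q)
A = pow(m1, s1, p)
B = pow(g1, x1, p) * pow(g2, x2, p) % p
C = pow(m2, s2, p)
z1p, z2p = pow(z1, s1, p), pow(z2, s2, p)
u, v = rng.randrange(1, q), rng.randrange(q)  # u kept nonzero so c'/u exists
ap = pow(a, u, p) * pow(g, v, p) % p
b1p = pow(b1, s1 * u, p) * pow(A, v, p) % p
b2p = pow(b2, s2 * u, p) * pow(C, v, p) % p
cp = H(A, B, C, z1p, z2p, ap, b1p, b2p)
c = cp * pow(u, -1, q) % q

# Step 3 (Bank)
r = (c * x + w) % q

# Step 4 (User)
user_accepts = (pow(g, r, p) == pow(h, c, p) * a % p
                and pow(m1, r, p) == pow(z1, c, p) * b1 % p
                and pow(m2, r, p) == pow(z2, c, p) * b2 % p)
rp = (r * u + v) % q

def verify_coin(A, B, C, sig):
    """Public check of Sign(A, B, C) = (z1', z2', a', b1', b2', r')."""
    z1p, z2p, ap, b1p, b2p, rp = sig
    cp = H(A, B, C, z1p, z2p, ap, b1p, b2p)
    return (pow(g, rp, p) == pow(h, cp, p) * ap % p
            and pow(A, rp, p) == pow(z1p, cp, p) * b1p % p
            and pow(C, rp, p) == pow(z2p, cp, p) * b2p % p)

signature = (z1p, z2p, ap, b1p, b2p, rp)
```

Bank only ever sees $c$ and $r$, while the coin carries $c'$ and $r'$; the blinding factors $u$ and $v$ are what separate the two views.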

Proposition 1. $A$ and $C$ (the blinded numbers of $I_1 g_2$ and $I_2 g_2$, respectively) and
their signatures are unconditionally untraceable to any specific execution of the
withdrawal protocol.

Proof Sketch: Given any $h$, for each pair $(A, B, C)$, $\mathrm{Sign}(A, B, C) =
(z_1', z_2', a', b_1', b_2', r')$ and the information that Bank gets during the execution of
any protocol in which User accepts, there are exactly $q$ possible random choices of
sets $(s_1, s_2, t, u, v)$ that could have been made by User that bring about the link. □

Proposition 2. Under the discrete logarithm assumption, there is no strategy with
non-negligible probability of success for User such that she ends up with a pair
$((A, B, C), \mathrm{sign}(A, B, C))$ for which she knows two different representations of
$(A, B, C)$ with respect to $(g_1, g_2)$.

Proof Sketch: Refer to Proposition 12 in [2]. □

Proposition 2 means that User cannot double-spend a coin without changing the
internal structure of the coin, that is, the account information. With these propositions,
we obtain the following assumption.

Assumption 1. The withdrawal protocol is a restrictive blind signature protocol
(the blinded numbers are $A$ and $C$) with blinding invariant functions $IV_1$ and $IV_2$ with
respect to $(g_1, g_2)$ defined by $IV_1(a_1, a_2) = IV_2(a_1, a_2) = a_1/a_2 \bmod q$.

Now, Bank’s attempt to link a coin with User is always frustrated by the following
proposition.

Proposition 3. Under the discrete logarithm assumption, Bank’s prior knowledge
of $u_1$ does not help her compute the coin.

Proof: This is immediate, since $s_1, s_2, x_1, x_2$ are chosen secretly by User to make a
coin $(A = (I_1 g_2)^{s_1}, B = g_1^{x_1} g_2^{x_2}, C = (I_2 g_2)^{s_2}, \mathrm{sign}(A, B, C))$.
Bank cannot compute them. □

- The payment protocol: User proves knowledge of the representations of
$A$, $B$, $C$ with respect to $(g_1, g_2)$ with the payment protocol. $A$ and $B$ are for the
knowledge proof of $u_1$, and $C$ is for that of $u_2$. In the protocol,
$D = g_1^{\mathit{deal}_1} g_2^{\mathit{deal}_2}$ corresponds to the commitment of the normal
3-move ZKIP.

1. User generates $\mathit{deal}_1, \mathit{deal}_2 \in_R Z_q^*$ and computes

$$D = g_1^{\mathit{deal}_1} g_2^{\mathit{deal}_2}.$$

User sends $(A, B, C, D, \mathrm{Sign}(A, B, C))$ to Shop.

2. If $A \neq 1$, then Shop computes and sends the challenge

$$d = H_0(A, B, C, D, I_S, \text{date/time}).$$

3. User computes and sends the responses

$$r_1 = d(u_1 s_1) + x_1 \bmod q, \quad r_2 = d s_1 + x_2 \bmod q,$$

$$r_3 = d(u_2 s_2) + \mathit{deal}_1 \bmod q, \quad r_4 = d s_2 + \mathit{deal}_2 \bmod q.$$

4. Shop accepts iff $\mathrm{Sign}(A, B, C)$ is valid, and

$$g_1^{r_1} g_2^{r_2} = A^d B, \quad g_1^{r_3} g_2^{r_4} = C^d D.$$
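The two acceptance equations of the payment protocol can be checked numerically with a short sketch. The toy group, the concrete secrets and the fixed challenge `d` (standing in for the output of $H_0$) are assumptions for illustration, and the signature check on $\mathrm{Sign}(A, B, C)$ is left out.

```python
p, q = 2039, 1019            # toy group: p = 2q + 1 (illustrative assumption)
g1, g2 = 9, 25               # two elements of order q (squares mod p)

# User's secrets and coin components (arbitrary toy values).
u1, u2 = 321, 654
s1, s2, x1, x2 = 45, 56, 67, 78
I1, I2 = pow(g1, u1, p), pow(g1, u2, p)
A = pow(I1 * g2 % p, s1, p)
B = pow(g1, x1, p) * pow(g2, x2, p) % p
C = pow(I2 * g2 % p, s2, p)

def commit(deal1, deal2):
    """Per-payment commitment D = g1^deal1 * g2^deal2."""
    return pow(g1, deal1, p) * pow(g2, deal2, p) % p

def respond(d, deal1, deal2):
    """User's four responses r1..r4 to the challenge d."""
    return ((d * u1 * s1 + x1) % q, (d * s1 + x2) % q,
            (d * u2 * s2 + deal1) % q, (d * s2 + deal2) % q)

def shop_accepts(d, D, responses):
    """Shop's two verification equations (signature check omitted)."""
    r1, r2, r3, r4 = responses
    lhs_ab = pow(g1, r1, p) * pow(g2, r2, p) % p
    lhs_cd = pow(g1, r3, p) * pow(g2, r4, p) % p
    return lhs_ab == pow(A, d, p) * B % p and lhs_cd == pow(C, d, p) * D % p

deal1, deal2, d = 5, 6, 17   # d stands in for H_0(A, B, C, D, I_S, date/time)
D = commit(deal1, deal2)
responses = respond(d, deal1, deal2)
accepted = shop_accepts(d, D, responses)
```

The first equation is Brands' check for $(A, B)$; the second has the same shape, but its commitment $D$ is fresh for every payment rather than fixed at withdrawal.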



Proposition 4. During the payment protocol, User does not reveal more information
than in Brands’ scheme.

Proof Sketch: The information for the proof of knowledge of a representation of $A$
with respect to $(g_1, g_2)$ is the same as in Brands’ scheme. The added information is for
the proof of knowledge of a representation of $C$, but it is Schnorr’s identification scheme
and does not reveal any information on $u_2$. □

- The deposit protocol: First, Bank checks the validity of the coin by scrutinizing
the signature and the transcript submitted by Shop. Double-spending is detected
by comparing $(A, C)$ with the entries in Bank’s database. If a match is found, Bank can
extract the account information by computing

$$I_1 = g_1^{(r_1 - r_1')/(r_2 - r_2')}.$$

Since $\mathit{deal}_1 \neq \mathit{deal}_1'$ and $\mathit{deal}_2 \neq \mathit{deal}_2'$,
$g_1^{(r_3 - r_3')/(r_4 - r_4')}$ does not reveal $I_2$ even
if double-spending occurs. Because Bank cannot compute any forged transcript of
dealing with the double-spent coin without knowing $u_2$, User is computationally
protected from Bank’s framing attempt after the double-spending.

1. Shop sends the transcript

$$(A, B, C, D, \text{date/time}, \mathrm{Sign}(A, B, C), r_1, r_2, r_3, r_4).$$





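2. Bank accepts iff $\mathrm{Sign}(A, B, C)$ is valid, and

$$g_1^{r_1} g_2^{r_2} = A^d B, \quad g_1^{r_3} g_2^{r_4} = C^d D$$

and $(A, C)$ has not been stored before. If $(A, C)$ is already in the deposit
database, Bank can find the double-spender’s identity by computing

$$I_1 = g_1^{(r_1 - r_1')/(r_2 - r_2')}.$$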
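The asymmetry between the two pairs of responses under double-spending can be seen in a minimal numeric sketch. The modulus and all concrete values are toy assumptions; the sketch only shows that the second ratio is offset by the fresh deal values, and is not a proof that $u_2$ stays hidden.

```python
q = 1019                                 # toy group order (illustrative assumption)
u1, u2 = 321, 654                        # User's two secrets
s1, s2, x1, x2 = 45, 56, 67, 78          # fixed when the coin is withdrawn

def respond(d, deal1, deal2):
    """User's four responses r1..r4 to the challenge d."""
    return ((d * u1 * s1 + x1) % q, (d * s1 + x2) % q,
            (d * u2 * s2 + deal1) % q, (d * s2 + deal2) % q)

def ratio(num, den):
    """num / den in Z_q (den must be nonzero mod q)."""
    return num * pow(den % q, -1, q) % q

# The same coin spent twice: different challenges, fresh deal values each time.
first = respond(17, 5, 6)
second = respond(29, 11, 13)

# (r1 - r1') / (r2 - r2') cancels x1, x2 and s1, leaving u1.
recovered_u1 = ratio(first[0] - second[0], first[1] - second[1])
# (r3 - r3') / (r4 - r4') still contains the differences of the deal values.
masked_ratio = ratio(first[2] - second[2], first[3] - second[3])
```

Here `recovered_u1` equals `u1`, so $I_1 = g_1^{u_1}$ identifies the double-spender, while `masked_ratio` differs from `u2` because the commitment exponents changed between the two payments.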

After double-spending is detected, there are two possible choices regarding the
use of $(u_1, u_2)$. One is that User throws away $(u_1, u_2)$ and performs the whole
account-opening procedure again, and the other is that User keeps using $(u_1, u_2)$. As
mentioned before, the revelation of $u_1$ does not give Bank any hint for linking a coin
with User’s identity. Thus, $(u_1, u_2)$ can be re-used without any modification. This feature
is very useful, since it eliminates the need to re-open an account and update
the database of Bank.

5 Conclusion 

We presented Bank’s framing problem that might arise after double-spending in digital
cash systems, and proposed a digital cash system that satisfies both the double-spending
detection capability and the framing prevention property. The idea is that our digital
cash system uses two secrets: one is for double-spending detection and the other is
not revealed even when double-spending occurs. Though we presented a digital cash
system based on Brands’ scheme, the idea may be applied to existing and forthcoming
digital cash systems.

In summary, our digital cash system has the following properties: 

1. If $n$-spending ($n > 2$) occurs, User cannot deny it.

2. Bank cannot frame User for $n$-spending ($n > 2$) when User has only double-spent ($n = 2$).

3. A once-spent coin does not reveal User’s identity.

4. After double-spending is detected, the account-reopening procedure is not needed.

Though we did not present the observer-based cash protocol, it is easy to augment
the proposed protocol so that it has that pre-restriction capability on
double-spending. In addition, our technique can easily be embedded in the recently
published digital cash system in [5].

Abstract. In this work we examine the diffusion layers of some block
ciphers referred to as substitution-permutation networks. We investigate
the practical security of these diffusion layers against differential and linear
cryptanalysis by using the notion of active S-boxes. We show that
the minimum number of differentially active S-boxes and that of linearly
active S-boxes are generally not identical and propose some special conditions
under which they are identical. Moreover, we apply our results to
analyze three diffusion layers used in the block ciphers E2, CRYPTON
and Rijndael, respectively. It is also shown that all of these diffusion layers
achieve optimal security under their respective constraints on the
operations used.



1 Introduction 

Shannon suggested that practically secure product ciphers may be constructed
using a mixing transformation consisting of a number of layers or rounds of
“confusion” and “diffusion” [19]. The confusion component is a nonlinear substitution
on a small subblock and the diffusion component is a linear mixing of the
subblock connections.

The Substitution-Permutation Network (SPN) structure is directly based on
the concepts of confusion and diffusion. One round of an SPN structure generally
consists of three layers: substitution, permutation, and key addition. The substitution
layer is made up of small nonlinear substitutions referred to as S-boxes,
easily implemented by table lookup, for the confusion effect. The permutation layer is
a linear transformation that diffuses the cryptographic characteristics of the
substitution layer. The key addition layer mixes in the round subkeys of the cipher,
and the position of this layer varies from cipher to cipher. A typical example
of one round of an SPN structure is given in Figure 1.

Due to memory requirements, most block cipher designers use small S-boxes,
e.g. with 4 or 8 input bits. Thus the diffusion of S-box outputs by the permutation
layer plays a great role in providing immunity against various attacks including
differential and linear cryptanalysis.

Fig. 1. One round of an SPN structure (a substitution layer followed by a permutation, i.e. diffusion, layer)

On the other hand, the permutation layers of most modern block ciphers are not
simple bitwise position permutations or transpositions but linear transformations
on some vector spaces over various finite fields. Hence in this paper, we call the
permutation layer the “diffusion layer” to make the distinction clear.

Diffusion layers of modern block ciphers of SPN structure are linear transformations
on $Z_2^n$ over some finite fields such as $GF(2)$ or $GF(2^n)$ and have a
one-to-one correspondence to an appropriate matrix. That is, most diffusion layers
have appropriate matrix representations. In this work, using these matrix
representations, we study the practical security against differential and linear
cryptanalysis of the diffusion layers of three AES first-round candidate algorithms:
E2, CRYPTON and Rijndael. The diffusion effects of diffusion layers constructed
with simple transpositions and an appropriate linear transformation
were studied in detail in [7]. However, the diffusion layers of E2, CRYPTON and
Rijndael are different from those of [7].

2 Practical Security against DC and LC 

2.1 Background 

The most well-known method of analyzing block ciphers today is differential
cryptanalysis (DC), proposed by Biham and Shamir [2,3] in 1990. DC is a chosen
plaintext attack in which the attacker chooses plaintexts with certain carefully
chosen differences. Biham and Shamir used the notion of “characteristic”,
while Lai, Massey and Murphy [11] showed that the notion of “differential” strictly
reflects the strength of a cipher against DC. Roughly speaking, a differential is
a collection of characteristics.

Another method of analyzing block ciphers is linear cryptanalysis (LC), published
by Matsui [13] in 1993. The attacks based on LC are known plaintext
attacks, and the LC attack on DES is faster than the attack by DC. The first
version of LC applied a “linear approximation” to an attack on block ciphers, but
Nyberg [15] considered a collection of linear approximations, which she called
a “linear hull”, for strict evaluation of the strength against LC.

Kanda et al.[8] classified four measures to evaluate the security of a cipher 
against DC and LC as follows: 

— Precise measure: The maximum average of differential and linear hull prob- 
abilities [11,15]. 

— Theoretical measure: The upper bounds of the maximum average of differ- 
ential and linear hull probabilities [14,16,1,9]. 

— Heuristic measure: The maximum average of differential characteristic and 
linear approximation probabilities [2,3,13]. 

— Practical measure: The upper bounds of the maximum average of differential 
characteristic and linear approximation probabilities [10,18,5]. 

DC and LC are the most powerful attacks on most symmetric block ciphers.
Accordingly, it is a basic requirement for the designer to evaluate the security of
any newly proposed cipher against DC and LC, and to prove that it is sufficiently
resistant against them. In this paper, we consider the practical measure out of the
above four measures, since the practical measure is feasible to evaluate while the
others are generally not.

2.2 Differentially and Linearly Active S-Boxes 

Let $S$ be an S-box with $m$ input and output bits, i.e., $S : Z_2^m \to Z_2^m$. The differential
and linear probabilities of $S$ are defined as follows.

Definition 1 For any given $\Delta x, \Delta y, a, b \in Z_2^m$, define the differential and linear
probabilities of $S$ by

$$DP^S(\Delta x \to \Delta y) = \frac{\#\{x \in Z_2^m : S(x) \oplus S(x \oplus \Delta x) = \Delta y\}}{2^m}$$

and

$$LP^S(a \to b) = \left( \frac{\#\{x \in Z_2^m : \langle a, x \rangle = \langle b, S(x) \rangle\}}{2^{m-1}} - 1 \right)^2,$$

respectively, where $\langle \alpha, \beta \rangle$ denotes the parity (0 or 1) of the bitwise product of $\alpha$
and $\beta$.

$DP^S$ and $LP^S$ for a strong S-box $S$ should be small enough for any input
difference $\Delta x \neq 0$ and output mask value $b \neq 0$. So we define parameters representing
the immunity of an S-box and of each substitution layer of an SPN structure against
DC and LC as follows:

Definition 2 The maximum differential and linear probabilities of $S$ are defined
by

$$DP^S_{\max} = \max_{\Delta x \neq 0,\, \Delta y} DP^S(\Delta x \to \Delta y)$$

and

$$LP^S_{\max} = \max_{a,\, b \neq 0} LP^S(a \to b),$$

respectively.
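Definitions 1 and 2 translate directly into a brute-force computation for small S-boxes. The following sketch assumes the usual counting form of the two probabilities; the 4-bit table `SBOX` is an arbitrary permutation chosen only for illustration and is not taken from any of the ciphers discussed here.

```python
# An arbitrary 4-bit bijective S-box (illustrative assumption).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
IDENTITY = list(range(16))       # a deliberately weak (linear) S-box

def parity(v):
    """Parity (0 or 1) of the bits of v."""
    return bin(v).count("1") & 1

def dp(sbox, dx, dy):
    """DP^S(dx -> dy): fraction of x with S(x) xor S(x xor dx) == dy."""
    size = len(sbox)
    count = sum(1 for x in range(size) if sbox[x] ^ sbox[x ^ dx] == dy)
    return count / size

def lp(sbox, a, b):
    """LP^S(a -> b) = (2 * #{x : <a,x> == <b,S(x)>} / 2^m - 1)^2."""
    size = len(sbox)
    count = sum(1 for x in range(size)
                if parity(a & x) == parity(b & sbox[x]))
    return (2 * count / size - 1) ** 2

def dp_max(sbox):
    """Maximum over nonzero input differences and all output differences."""
    size = len(sbox)
    return max(dp(sbox, dx, dy) for dx in range(1, size) for dy in range(size))

def lp_max(sbox):
    """Maximum over all input masks and nonzero output masks."""
    size = len(sbox)
    return max(lp(sbox, a, b) for a in range(size) for b in range(1, size))

example_dp, example_lp = dp_max(SBOX), lp_max(SBOX)
```

For the identity table both maxima are 1, the worst possible value, whereas a nonlinear table gives values strictly below 1; these maxima are the quantities that enter $p$ and $q$ for a whole substitution layer.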

Definition 3 Assume that each substitution layer of an SPN structure consists of
$n$ S-boxes $S_1, S_2, \ldots, S_n$. The maximum differential and linear probabilities of
the substitution layer are defined by

$$p = \max_{1 \le i \le n} DP^{S_i}_{\max}, \quad q = \max_{1 \le i \le n} LP^{S_i}_{\max},$$

respectively.

Evaluation of the security of a block cipher of SPN structure by the practical measure
begins with the concept of an active S-box.

Definition 4 A differentially active S-box is defined as an S-box given a nonzero
input difference, and a linearly active S-box as an S-box given a nonzero
output mask value.

By computing the minimum number of differentially and linearly active S-boxes,
we can evaluate the security of a block cipher from the viewpoint of practical
security against DC and LC [10,18,5]. We can obtain upper bounds on the maximum
differential characteristic and linear approximation probabilities from the
minimum number of active S-boxes. Thus in the case of an SPN structure, it is important
to analyze how the diffusion layer increases the minimum number of active S-boxes
over two consecutive rounds.

Note that we can omit the key addition layer when computing the number of
active S-boxes, since this layer has no influence under the assumption that the
key addition is performed by bitwise EXORs. Define the SDS function
with three layers of substitution-diffusion-substitution for analyzing the role of the
diffusion layer in raising the number of active S-boxes in two consecutive rounds of
an SPN structure (Figure 2).

Throughout this paper we assume that all S-boxes in the substitution layer
are bijective. If an S-box is bijective and differentially/linearly active, then it
has a nonzero output difference/input mask value [14]. So when all S-boxes in the
substitution layer are bijective, we can define the minimum number of active
S-boxes of the SDS function. Denote the diffusion layer of the SDS function by $D$, the
input difference of $D$ by $\Delta x = x \oplus x^*$, the output difference by
$\Delta y = y \oplus y^* = D(x) \oplus D(x^*)$,
and the input and output mask values by $a$ and $b$, respectively.

Definition 5 The minimum numbers of differentially and linearly active S-boxes
of the SDS function are defined by

$$\beta_d(D) = \min_{\Delta x \neq 0} \{H_c(\Delta x) + H_c(\Delta y)\}$$

and

$$\beta_l(D) = \min_{b \neq 0} \{H_c(a) + H_c(b)\},$$

respectively, where for each $x = (x_1, x_2, \ldots, x_n) \in (Z_2^m)^n$, $x_i \in Z_2^m$, the component
Hamming weight of $x$ is defined by

$$H_c(x) = \#\{1 \le i \le n : x_i \neq 0\}.$$

Fig. 2. SDS function (substitution with S-boxes, diffusion, substitution with S-boxes)



Theorem 1 Let $\beta_d(D)$ and $\beta_l(D)$ be the minimum numbers of differentially and linearly
active S-boxes in the SDS function, respectively. Then the maximum differential
and linear probabilities $p_{sds}$ and $q_{sds}$ of the SDS function satisfy

$$p_{sds} \le p^{\beta_d(D)} \quad \text{and} \quad q_{sds} \le q^{\beta_l(D)}.$$

The above theorem follows easily from the maximality of $p$ (or $q$) and the
minimality of $\beta_d(D)$ (or $\beta_l(D)$). Evaluation of practical security against DC and
LC is based on this theorem.



2.3 Matrix Representation of Diffusion Layer 

Most diffusion layers of modern block ciphers of SPN structure are linear transformations
on $Z_2^n$ over some finite fields such as $GF(2)$ or $GF(2^n)$ and have a
one-to-one correspondence to an appropriate matrix. That is, most diffusion layers
have appropriate matrix representations. If we use this matrix representation
for a diffusion layer, then we obtain the relationship between input and output
differences (or mask values) given in the following theorem.

Theorem 2 Assume that the diffusion layer $D$ of the SDS function is represented
as a matrix $M$. Then the matrix for the relationship between input and output
differences is the same matrix $M$, while the matrix for the relationship
between output and input mask values is the transposed matrix
$M^t$. That is,

$$\Delta y = M \Delta x, \quad a = M^t b.$$

Proof: This theorem is proven in [4] by using the notion of correlation matrix. □
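The transpose in Theorem 2 can be checked exhaustively on a small instance: an output mask $b$ applied to $y = Mx$ selects the same parity as the input mask $a = M^t b$ applied to $x$. The 4 × 4 matrix and the 2-bit word size below are arbitrary assumptions that keep the enumeration cheap.

```python
from itertools import product

M = [[1, 1, 1, 1],
     [1, 0, 0, 1],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]          # an arbitrary 0/1 example matrix (assumption)
WORD_BITS = 2               # tiny word size so that exhaustive checking is cheap

def apply(matrix, x):
    """y_i = XOR over j of matrix[i][j] * x_j, with x_j treated as bit strings."""
    out = []
    for row in matrix:
        acc = 0
        for bit, word in zip(row, x):
            if bit:
                acc ^= word
        out.append(acc)
    return tuple(out)

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def parity(mask, value):
    """<mask, value>: parity of the bitwise product, taken over all words."""
    return sum(bin(a & v).count("1") for a, v in zip(mask, value)) & 1

vectors = list(product(range(1 << WORD_BITS), repeat=len(M)))
Mt = transpose(M)

# <M^t b, x> == <b, M x> for every output mask b and every input x.
theorem2_holds = all(
    parity(apply(Mt, b), x) == parity(b, apply(M, x))
    for b in vectors for x in vectors
)
```

This identity is the reason why differences propagate through $M$ while mask values propagate through $M^t$.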

We can compute the minimum numbers of differentially and
linearly active S-boxes ($\beta_d(D)$ and $\beta_l(D)$) of the SDS function by using the matrix
representation of Theorem 2. However, the minimum numbers of differentially and
linearly active S-boxes are not identical in general. In the next section, we will
show that $\beta_d(D) \neq \beta_l(D)$ is possible by giving a counterexample. On the other hand,
the minimum numbers of differentially and linearly active S-boxes are identical
for special types of representation matrix $M$, as in the following two theorems.

Theorem 3 Let the diffusion layer $D$ of the SDS function be represented as an $n \times n$
matrix $M$. If $M$ is a symmetric or orthogonal matrix, then $\beta_d(D) = \beta_l(D)$.

Proof: By Theorem 2 and Definition 5, $\beta_d(D)$ and $\beta_l(D)$ are expressed as follows:

$$\beta_d(D) = \min_{\Delta x \neq 0} \{H_c(\Delta x) + H_c(M \Delta x)\},$$

$$\beta_l(D) = \min_{b \neq 0} \{H_c(M^t b) + H_c(b)\}.$$

From this, we can easily see that $\beta_d(D) = \beta_l(D)$ if $M$ is a symmetric matrix,
for which $M^t = M$.

Meanwhile, if $M$ is an orthogonal matrix, i.e., $M^{-1} = M^t$, then $a = M^t b$
implies that $b = Ma$, and the condition $b = Ma \neq 0$ is identical to $a \neq 0$ since
$M$ is an invertible matrix. Thus

$$\beta_l(D) = \min_{a \neq 0} \{H_c(a) + H_c(Ma)\},$$

and $\beta_d(D) = \beta_l(D)$. □

Theorem 4 If $M^t$ is obtained from $M$ by exchanging row
or column vectors, then $\beta_d(D) = \beta_l(D)$, where $M$ is the representation matrix
of the diffusion layer $D$ of the SDS function.

Proof: Exchanging row vectors of $M$ changes the
order of the components of the output difference $\Delta y$, and this operation does not affect
the component Hamming weight $H_c(\Delta y)$. On the other hand, it is clear that
$H_c(\Delta y)$ is determined by the column vectors of $M$ but is independent of their position.
Thus exchanging column vectors of $M$ also does not affect
the component Hamming weight $H_c(\Delta y)$. Since a row (column) vector of $M$ is
a column (row) vector of $M^t$, exchanging row or column vectors of
$M$ does not affect the component Hamming weight $H_c(a)$ either. Therefore, if
$M^t$ is obtained from $M$ by those operations, $\beta_d(D) = \beta_l(D)$. □

3 Diffusion Layer of E2

E2 is a 128-bit block cipher designed by NTT of Japan, which is one of the
fifteen candidates in the first round of the AES (Advanced Encryption Standard)
project [17]. The overall structure of E2 is a Feistel network, like DES, and its round
function adopts what is called a “2-round SPN structure” [8]. Note that the 2-round SPN
structure without the key addition layer is exactly the SDS function.

In the round function of E2, the diffusion layer is constructed with just bitwise
EXORs and is expressed as an $8 \times 8$ matrix with only 0 and 1 entries. Kanda
et al. [8] described the relationship between the matrix representation and the
actual construction of the diffusion layer and proposed a search algorithm for
constructing the optimal diffusion layer. Furthermore, they showed that a Feistel
structure whose round function uses the 2-round SPN structure requires
one-fourth as many rounds as one using the “1-round SPN structure”, which is composed of
one substitution and one permutation layer, to achieve the same differential and
linear probabilities. That is, the round function using the 2-round SPN structure
is twice as efficient as that using the 1-round SPN structure.

Let each S-box of the substitution layers have $m$ input and output bits, and let the
number of S-boxes in each substitution layer be $n$. Assume that the inputs of the SDS
function are linearly transformed to outputs in $m$-bit units and the diffusion layer is
constructed with just bitwise EXORs. Then the diffusion layer is represented as
an $n \times n$ matrix $M$ whose entries are all zero or one, as follows:

$$y_i = \bigoplus_{j=1}^{n} \mu_{ij} x_j,$$

where $x = (x_1, x_2, \ldots, x_n) \in (Z_2^m)^n$ is the input, $y = (y_1, y_2, \ldots, y_n)$ is the
output, and $M = (\mu_{ij})$.

Kanda et al. [8] studied the diffusion property of the diffusion layer with this
matrix representation. Their study was based on the relationship between the
matrices for differential characteristics and linear approximations. However, they
made two conjectures to develop their theory. Conjecture 1 of [8] is correct
since it is a special case of Theorem 2, but Conjecture 2 of [8] is false.
We disprove this conjecture by giving a counterexample.

Conjecture 2 of [8]. In the SDS function, the minimum number of differentially
active S-boxes is equal to the minimum number of linearly active S-boxes.
That is, $\beta_d(D) = \beta_l(D)$, where $M$ is the representation matrix of the diffusion
layer $D$.

Counterexample for Conjecture 2 of [8]: Suppose that the diffusion
layer of the SDS function with $n = 4$ is represented by the following invertible
matrix:

$$M = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad M^t = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}.$$

If $H_c(\Delta x) = 1$, then $H_c(\Delta y) \ge 2$ since $H_c(\Delta y)$ is determined by a column
vector of $M$ and the Hamming weight of each column vector is at least 2. If
$H_c(\Delta x) = 2$, then $H_c(\Delta y)$ is determined by the EXOR of two different column
vectors, and any such EXOR has Hamming weight at least 1.
Thus the minimum number of differentially active S-boxes is $\beta_d(D) = 3$.

On the other hand, by Theorem 2, the relationship between output and input
mask values is represented by the transposed matrix $M^t$ of $M$. Note that the Hamming
weight of the fourth column vector of $M^t$ is 1. Consider an output mask value
of the form $b = (0, 0, 0, b_4)$, $b_4 \neq 0$. Since

$$\begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 0 \\ b_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ b_4 \\ 0 \end{pmatrix},$$

the corresponding input mask value is $a = (0, 0, b_4, 0)$. From this we obtain
$\beta_l(D) = 2$. Consequently, $\beta_d(D) \neq \beta_l(D)$ for the above $4 \times 4$
matrix $M$. □
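The counterexample can be confirmed by exhaustive search. The sketch below enumerates all nonzero inputs of the $4 \times 4$ matrix $M$; the 2-bit word size and the helper names are assumptions made to keep the search small, and the counting argument in the text does not depend on the word size.

```python
from itertools import product

# The counterexample matrix M from the text.
M = [[1, 1, 1, 1],
     [1, 0, 0, 1],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

def apply(matrix, x):
    """y_i = XOR over j of matrix[i][j] * x_j for word-valued x_j."""
    out = []
    for row in matrix:
        acc = 0
        for bit, word in zip(row, x):
            if bit:
                acc ^= word
        out.append(acc)
    return tuple(out)

def hc(x):
    """Component Hamming weight: number of nonzero words."""
    return sum(1 for word in x if word != 0)

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def min_active(matrix, word_bits=2):
    """min over nonzero x of H_c(x) + H_c(matrix * x)."""
    n = len(matrix)
    return min(hc(x) + hc(apply(matrix, x))
               for x in product(range(1 << word_bits), repeat=n) if any(x))

beta_d = min_active(M)              # differences propagate through M
beta_l = min_active(transpose(M))   # mask values propagate through M^t
```

The search gives `beta_d == 3` and `beta_l == 2`, matching the two values derived in the counterexample.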

As a matter of convenience, we abuse our notation and use $\beta_d(M)$ (or $\beta_l(M)$)
for $\beta_d(D)$ (or $\beta_l(D)$) henceforth, where $M$ is the representation matrix of the
diffusion layer $D$. In the block cipher E2, the designers considered the SDS function
with $n = 8$. Kanda et al. [8] suggested a method of determining an $8 \times 8$ matrix
$M = P$ yielding the maximum value of $\beta_d(P)$ using a search algorithm. Using
this search algorithm, they found that there is no matrix with $\beta_d(P) \ge 6$, and
that there are some candidate matrices with $\beta_d(P) = 5$. Here, we give a theoretical
proof of the fact that $\beta_d(P) = 5$ is optimal and also that $\beta_l(P) \le 5$, where $P$
is an $8 \times 8$ invertible matrix.

Theorem 5 Assume that the number of S-boxes in the substitution layer of the
SDS function is 8 ($n = 8$). If the representation matrix $P$ of the diffusion layer
is an $8 \times 8$ invertible matrix, then $\beta_d(P), \beta_l(P) \le 5$.

Proof: Since $P$ is an $8 \times 8$ invertible matrix, its eight column vectors $P_1, P_2, \ldots, P_8$
are linearly independent. Thus the number of columns with Hamming weight
8 is at most one. Note that $\beta_d(P)$ is closely related to the Hamming weights of the
column vectors of $P$. We separate the proof into four cases. Here, the Hamming
weight $H_c(P_j)$ of a column vector $P_j$ is the number of entries equal to 1 in $P_j$.

Case 1 If $\min_{1 \le j \le 8} H_c(P_j) = 7$, then for any two column vectors $P_j$ and $P_k$,
$H_c(P_j \oplus P_k) \le 2$. By considering $\Delta x$ such that $H_c(\Delta x) = 2$, we obtain
$\beta_d(P) \le 2 + 2 = 4$.

Case 2 Suppose that $\min_{1 \le j \le 8} H_c(P_j) = 6$. If there exists a column vector with
Hamming weight 8, then $H_c(\Delta y) \le 2$ for some $\Delta x$ such that $H_c(\Delta x) = 2$.
If there exists a column vector with Hamming weight 7, the minimum value
of $H_c(\Delta y)$ is at most 3, since we can consider the EXOR of columns with
Hamming weights 6 and 7, where $H_c(\Delta x) = 2$. Finally, if the Hamming weight of
all column vectors is 6, then even if some four different column vectors have their
0 entries in distinct rows, the Hamming weight of the EXOR of one of these
four columns and a fifth column vector is 2. Consequently, we obtain
$\beta_d(P) \le 5$.

Case 3 Assume that $\min_{1 \le j \le 8} H_c(P_j) = 5$. If there exists a column vector with
Hamming weight 8, then $H_c(\Delta y) \le 3$ for some $\Delta x$ such that $H_c(\Delta x) = 2$. If
there exists a column vector with Hamming weight 7 or 6, we obtain the bound by an
analysis similar to Case 2. In the case that the Hamming weight of
every column vector is 5, even if the 0 entries of some five different column vectors
are arranged optimally, a sixth column vector and one of these five columns
have at least two 0 entries in common in the same rows. Thus $H_c(\Delta y)$ is at most
2 where $H_c(\Delta x) = 2$. Therefore $\beta_d(P) \le 5$ also holds in this case.

Case 4 Assume that $\min_{1 \le j \le 8} H_c(P_j) \le 4$. Consider only $\Delta x$ such that
$H_c(\Delta x) = 1$. Then we easily obtain $\beta_d(P) \le 5$, since there exists a column with
Hamming weight at most 4.

By Cases 1-4, we obtain that $\beta_d(P) \le 5$ always holds whenever $P$ is an $8 \times 8$
invertible matrix. On the other hand, by Theorem 2, $\beta_l(P)$ is related to $P^t$.
Thus we can also obtain the same result for $\beta_l(P)$ by considering the Hamming
weights of the row vectors instead of the column vectors of $P$. □

We can see that each case in the proof of the above theorem depends only on
the Hamming weights of the column vectors. Kanda et al. [8] found 10080
candidate matrices with $\beta_d(P) = 5$ by their search algorithm, and for all candidate
matrices the total Hamming weight is 44, with 4 column (row) vectors of
Hamming weight six and 4 column (row) vectors of Hamming weight five. One of
these candidate matrices, whose construction is easy to determine, is used in
the block cipher E2 as follows:



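$$P = \begin{pmatrix}
0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 1 & 1 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 & 0 & 0 & 1
\end{pmatrix}, \quad
P^t = \begin{pmatrix}
0 & 1 & 1 & 1 & 1 & 1 & 0 & 1 \\
1 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 0 & 0 & 1 & 1
\end{pmatrix}.$$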



It is easy to see that the above matrix $P^t$ is obtained from $P$ by
exchanging row or column vectors. Therefore, by Theorem 4, we
obtain that

$$\beta_d(P) = \beta_l(P).$$

That is, Conjecture 2 of [8] is correct for the above matrix $P$.
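The weight properties of the E2 matrix $P$ can be checked with a short subset search. This sketch works at the bit level (one-bit words), which is an assumption made for speed: restricting the inputs in this way can only give an upper bound on $\beta_d(P)$ and $\beta_l(P)$, and that bound turns out to coincide with the value 5 stated in the text.

```python
from itertools import combinations

# Rows of the E2 diffusion matrix P given in this section.
P = [
    [0, 1, 1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
    [1, 1, 0, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 1, 1, 0, 0, 1],
]
n = len(P)

# Pack every column and every row into an integer bit vector.
columns = [sum(P[i][j] << i for i in range(n)) for j in range(n)]
rows = [sum(P[i][j] << j for j in range(n)) for i in range(n)]

def weight(v):
    return bin(v).count("1")

def min_active(vectors):
    """min over nonempty subsets T of |T| + weight(XOR of the vectors in T)."""
    best = None
    for k in range(1, len(vectors) + 1):
        for subset in combinations(vectors, k):
            acc = 0
            for v in subset:
                acc ^= v
            total = k + weight(acc)
            best = total if best is None else min(best, total)
    return best

total_weight = sum(map(sum, P))
column_weights = sorted(weight(c) for c in columns)
beta_d_bound = min_active(columns)   # differences: XOR of columns of P
beta_l_bound = min_active(rows)      # masks: XOR of columns of P^t, i.e. rows of P
```

The total weight is 44 with four columns of weight six and four of weight five, and both bounds equal 5, in line with the optimal value of Theorem 5.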

4 Diffusion Layer of CRYPTON 

The block cipher CRYPTON [12] is one of the fifteen candidates in the first
round of the AES. The overall structure of CRYPTON is an SPN structure influenced
by SQUARE [5], which uses a round function allowing parallel processing on
the whole data block. The round function of CRYPTON consists of four parallelizable
steps: byte-wise substitution, column-wise bit transformation, column-to-row
transposition, and key addition. Of these four steps, the column-wise
bit transformation and the column-to-row transposition make up the diffusion
layer. In this section, we concentrate our discussion on the column-wise
bit transformation.

The bit transformation $\pi$ mixes the four bytes in each byte column of a $4 \times 4$ byte
array, and $\pi$ is expressed as $\pi = (\pi_0, \pi_1, \pi_2, \pi_3)$, where $\pi_i$ is the bit
transformation of the $i$-th column. In order to analyze the diffusion effect of $\pi$, it
suffices to consider $\pi_0$, since the other $\pi_i$ ($i = 1, 2, 3$) are obtained from $\pi_0$ by a
simple byte transposition.

For any $x = (x_1, x_2, x_3, x_4) \in (Z_2^8)^4$, $y = \pi_0(x) = (y_1, y_2, y_3, y_4) \in (Z_2^8)^4$ is
defined by

$$y_1 = (x_1 \wedge m_1) \oplus (x_2 \wedge m_2) \oplus (x_3 \wedge m_3) \oplus (x_4 \wedge m_4)$$

$$y_2 = (x_1 \wedge m_2) \oplus (x_2 \wedge m_3) \oplus (x_3 \wedge m_4) \oplus (x_4 \wedge m_1)$$

$$y_3 = (x_1 \wedge m_3) \oplus (x_2 \wedge m_4) \oplus (x_3 \wedge m_1) \oplus (x_4 \wedge m_2)$$

$$y_4 = (x_1 \wedge m_4) \oplus (x_2 \wedge m_1) \oplus (x_3 \wedge m_2) \oplus (x_4 \wedge m_3),$$

where for $a, b \in Z_2^8$, $a \wedge b$ is the bitwise AND, and

$$m_1 = \texttt{0xfc} = 11111100_{(2)}$$
$$m_2 = \texttt{0xf3} = 11110011_{(2)}$$
$$m_3 = \texttt{0xcf} = 11001111_{(2)}$$
$$m_4 = \texttt{0x3f} = 00111111_{(2)}$$


Now we consider the matrix representation of the bit transformation $\pi_0$.
The bit transformation $\pi_0$ can be implemented by using bitwise EXOR and
AND logic, and is a linear transformation on the vector space $Z_2^{32}$ over the
finite field $GF(2)$. Hence $\pi_0$ has a unique matrix representation under an
appropriate basis. Let the standard basis be given for the vector space $Z_2^{32}$.
Then the $32 \times 32$ matrix $Q$ over $GF(2)$ corresponding to $\pi_0$ is expressed as
follows: For $1 \le i \le 4$, set

$$M_i = \begin{pmatrix} m_{i1} & 0 & \cdots & 0 \\ 0 & m_{i2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & m_{i8} \end{pmatrix},$$

where $m_i = (m_{i1}, m_{i2}, \ldots, m_{i8})$; then

$$Q = \begin{pmatrix} M_1 & M_2 & M_3 & M_4 \\ M_2 & M_3 & M_4 & M_1 \\ M_3 & M_4 & M_1 & M_2 \\ M_4 & M_1 & M_2 & M_3 \end{pmatrix}.$$

That is,

$$(\pi_0(x))^t = Q x^t.$$

In [12], the author examined the diffusion property of $\pi_0$ closely by using the notion of "diffusion order". However, the definition of diffusion order is exactly the same as that of the "branch number" used in the earlier document on the block cipher SHARK [18].

Definition 6 For any transformation $L$, the branch number of $L$ is defined by

$$B(L) = \min_{a \neq 0} \{ H_c(a) + H_c(L(a)) \} .$$

It is shown that $B(\pi_0) = 4$ and that there are only 204 values among the $2^{32}$ possible values that achieve the branch number 4 [12]. On the other hand, we can show theoretically that $B(\pi_0) \le 4$ by a process similar to the proof of Theorem 5, since $Q$ is an invertible matrix. Thus the diffusion effect of $\pi_0$ is optimal under the condition that the transformation consists only of bitwise EXOR and AND logic.

The branch number of a diffusion layer is closely related to the minimum number of differentially and linearly active S-boxes of this layer. In the case of $\pi_0$, it is easily seen that $\beta_d(\pi_0) = B(\pi_0)$ since

$$\pi_0(x \oplus x^*) = \pi_0(x) \oplus \pi_0(x^*) .$$

Moreover $Q$ is a symmetric matrix ($Q = Q^t$), thus $\beta_d(\pi_0) = \beta_l(\pi_0)$ by Theorem 3. Consequently, for the bit transformation $\pi_0$ in the diffusion layer of CRYPTON, we obtain that

$$\beta_d(\pi_0) = \beta_l(\pi_0) = B(\pi_0) = 4$$

and this is the optimal value for a linear transformation constructed with only bitwise EXORs and ANDs.



5 Diffusion Layer of Rijndael 

Rijndael [6] is a block cipher designed as a candidate algorithm for the AES. Recently, NIST announced the five AES finalist candidates for round 2 [20]. The block cipher Rijndael was included among the five AES finalists. The design of Rijndael was strongly influenced by the design of the earlier block cipher SQUARE [5]. In this section, we study the diffusion layer that is used in both SQUARE and Rijndael.

The diffusion layers of SQUARE and Rijndael consist of row-wise (or column-wise) bit transformations and bytewise transpositions that operate on a 4 x 4 array of bytes. The bit transformations used in the two block ciphers are mathematically identical, except that the operation is row-wise in SQUARE and column-wise in Rijndael. For convenience, we use the notation $\theta$ of SQUARE for the bit transformation.

Let $\xi = (\xi_0, \xi_1, \xi_2, \xi_3) \in GF(2^8)^4$ be a 4-byte input of $\theta$ and $\theta(\xi) = \zeta = (\zeta_0, \zeta_1, \zeta_2, \zeta_3) \in GF(2^8)^4$ be the corresponding output. Then $\xi$ and $\zeta$ correspond to polynomials in $GF(2^8)[x]$. That is, they can be denoted by

$$\xi(x) = \xi_0 + \xi_1 x + \xi_2 x^2 + \xi_3 x^3$$

and

$$\zeta(x) = \zeta_0 + \zeta_1 x + \zeta_2 x^2 + \zeta_3 x^3 ,$$

respectively. Defining $c(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3$, we can describe $\theta$ as a modular polynomial multiplication:

$$\zeta = \theta(\xi) \iff \zeta(x) = c(x)\,\xi(x) \pmod{1 + x^4} . \qquad (1)$$

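The matrix representation of (1) is as follows:

$$
\begin{pmatrix} \zeta_0 \\ \zeta_1 \\ \zeta_2 \\ \zeta_3 \end{pmatrix}
=
\begin{pmatrix}
c_0 & c_3 & c_2 & c_1 \\
c_1 & c_0 & c_3 & c_2 \\
c_2 & c_1 & c_0 & c_3 \\
c_3 & c_2 & c_1 & c_0
\end{pmatrix}
\begin{pmatrix} \xi_0 \\ \xi_1 \\ \xi_2 \\ \xi_3 \end{pmatrix} . \qquad (2)
$$

Let $G$ be the 4 x 4 matrix of (2); then (2) can be written simply as

$$\zeta^t = G \xi^t .$$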

Note that the bit transformation $\theta$ is a linear transformation on the vector space $(GF(2^8))^4$ over the finite field $GF(2^8)$. It is shown in [5] that if $\Delta\xi(x) = \xi(x) \oplus \xi^*(x)$ is an input difference of $\theta$, the output difference is

$$
\begin{aligned}
\Delta\zeta(x) &= c(x)\xi(x) \oplus c(x)\xi^*(x) \pmod{1 + x^4} \\
&= c(x)\,\Delta\xi(x) \pmod{1 + x^4} .
\end{aligned}
$$

This can be written equivalently as

$$(\Delta\zeta)^t = G (\Delta\xi)^t , \qquad (3)$$

which is the formula that appears in Theorem 2. From this we obtain that

$$\beta_d(\theta) = B(\theta) . \qquad (4)$$

Let $a(x)$ be an input mask value and $b(x)$ be the corresponding output mask value from the viewpoint of LC. Then $a(x)$ and $b(x)$ satisfy

$$a(x) = c(x^{-1})\, b(x) \pmod{1 + x^4} \qquad (5)$$

and (5) can be written as

$$
\begin{aligned}
a_0 &= c_0 \cdot b_0 \oplus c_1 \cdot b_1 \oplus c_2 \cdot b_2 \oplus c_3 \cdot b_3 \\
a_1 &= c_3 \cdot b_0 \oplus c_0 \cdot b_1 \oplus c_1 \cdot b_2 \oplus c_2 \cdot b_3 \\
a_2 &= c_2 \cdot b_0 \oplus c_3 \cdot b_1 \oplus c_0 \cdot b_2 \oplus c_1 \cdot b_3 \\
a_3 &= c_1 \cdot b_0 \oplus c_2 \cdot b_1 \oplus c_3 \cdot b_2 \oplus c_0 \cdot b_3
\end{aligned}
$$

where $\cdot$ means multiplication in the finite field $GF(2^8)$, $a(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3$ and $b(x) = b_0 + b_1 x + b_2 x^2 + b_3 x^3$. The matrix representation of the above formula is

$$
\begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix}
=
\begin{pmatrix}
c_0 & c_1 & c_2 & c_3 \\
c_3 & c_0 & c_1 & c_2 \\
c_2 & c_3 & c_0 & c_1 \\
c_1 & c_2 & c_3 & c_0
\end{pmatrix}
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}
$$

and equivalently

$$a^t = G^t b^t , \qquad (6)$$

which also appears in Theorem 2.

In fact, the two matrices $G$ and $G^t$ of (3) and (6), respectively, used in both SQUARE and Rijndael are given as follows:

$$
G = \begin{pmatrix}
2 & 3 & 1 & 1 \\
1 & 2 & 3 & 1 \\
1 & 1 & 2 & 3 \\
3 & 1 & 1 & 2
\end{pmatrix},
\qquad
G^t = \begin{pmatrix}
2 & 1 & 1 & 3 \\
3 & 2 & 1 & 1 \\
1 & 3 & 2 & 1 \\
1 & 1 & 3 & 2
\end{pmatrix} .
$$

At this point, it is easy to see that the matrix $G$ can be obtained from $G^t$ by appropriate transpositions of row and column vectors. Therefore, by Theorem 4 and (4), we obtain that

$$\beta_d(\theta) = \beta_l(\theta) = B(\theta) .$$

On the other hand, the branch number is $B(\theta) = 5$, and this is the maximal branch number. In [18] it was shown how a linear transformation on $(GF(2^m))^n$ with optimal branch number ($B = n + 1$) can be constructed from a maximum distance separable code. The polynomial multiplication with $c(x)$ corresponds to a special subset of the maximum distance separable codes.

However, the fact that $B(\theta) = 5$ can also be shown by methods similar to those used in the proof of Theorem 5. Since the additive operation of $GF(2^8)$ is the bitwise EXOR, the branch number $B(\theta)$ is determined by the Hamming weights of the EXORs among column vectors of the matrix $G$.
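The claim $B(\theta) = 5$ can be checked numerically through the standard equivalence between optimal branch number and the maximum distance separable property: a square matrix has branch number $n + 1$ exactly when every square submatrix is nonsingular. The following sketch tests this for $G$. It assumes Rijndael's reduction polynomial `0x11B` for $GF(2^8)$, which the text above does not state, and the helper names are hypothetical.

```python
from itertools import combinations


def gf_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8); 0x11B is an assumed reduction polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r


def gf_det(m):
    """Determinant over GF(2^8) by cofactor expansion (signs vanish in characteristic 2)."""
    if len(m) == 1:
        return m[0][0]
    det = 0
    for j, coeff in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        det ^= gf_mul(coeff, gf_det(minor))
    return det


def all_minors_nonzero(m):
    """True if every square submatrix of m is nonsingular."""
    n = len(m)
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                sub = [[m[r][c] for c in cols] for r in rows]
                if gf_det(sub) == 0:
                    return False
    return True


G = [[2, 3, 1, 1],
     [1, 2, 3, 1],
     [1, 1, 2, 3],
     [3, 1, 1, 2]]

print(all_minors_nonzero(G))
```

A single active input byte already shows that the bound is attained: the first column of $G$ has four non-zero entries, so the weights add up to 1 + 4 = 5.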

The bit transformation $\theta$ has the best diffusion effect, but its computational efficiency is relatively low since multiplication in $GF(2^8)$ is complicated. The authors of SQUARE and Rijndael argued that the computational efficiency of $\theta$ is improved by using

$$c(x) = 02 + 01 \cdot x + 01 \cdot x^2 + 03 \cdot x^3$$

since the multiplication in $GF(2^8)$ can then be implemented by one-bit shift operations and bitwise EXORs. There is some truth in this assertion if we consider only the encryption process, since the coefficients 01, 02, 03 of $c(x)$ are relatively small. If we consider the decryption process, the inverse of $\theta$ is needed, and $\theta^{-1}$ corresponds to the polynomial

$$d(x) = 0E + 09 \cdot x + 0D \cdot x^2 + 0B \cdot x^3 .$$

However, the coefficients 09, 0B, 0D, 0E of $d(x)$ are large and the computational efficiency of $\theta^{-1}$ is relatively low. Therefore the bit transformation of SQUARE and Rijndael has the best diffusion effect, but its computational efficiency is inferior to that of the other diffusion layers.
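The relation between $c(x)$ and $d(x)$ can be verified with a short sketch: multiplication by 02 is a single shift with a conditional EXOR, and the product $c(x)d(x)$ reduces to 1 modulo $1 + x^4$. The reduction polynomial `0x11B` is again an assumption (it is Rijndael's), and `xtime`, `gf_mul` and `poly_mul_mod` are illustrative names, not taken from either cipher's specification.

```python
def xtime(a, poly=0x11B):
    """Multiply by 02 in GF(2^8): one shift and a conditional EXOR."""
    a <<= 1
    return a ^ poly if a & 0x100 else a


def gf_mul(a, b, poly=0x11B):
    """General multiplication built from repeated xtime and EXOR."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a, poly)
        b >>= 1
    return r


def poly_mul_mod(c, d):
    """Multiply two degree-3 polynomials over GF(2^8) modulo 1 + x^4."""
    out = [0, 0, 0, 0]
    for i in range(4):
        for j in range(4):
            out[(i + j) % 4] ^= gf_mul(c[i], d[j])
    return out


c = [0x02, 0x01, 0x01, 0x03]  # c(x) = 02 + 01 x + 01 x^2 + 03 x^3
d = [0x0E, 0x09, 0x0D, 0x0B]  # d(x) = 0E + 09 x + 0D x^2 + 0B x^3

print(poly_mul_mod(c, d))
```

Multiplying by 03 costs one `xtime` and one EXOR, whereas a coefficient such as 0E needs three `xtime` calls and two EXORs, which illustrates why the decryption direction is more expensive.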

6 Comparison of Diffusion Layers of E2, CRYPTON, and 
Rijndael 

While the diffusion layers of E2 and CRYPTON are linear transformations over $GF(2)$, the diffusion layer of Rijndael is a linear transformation over the finite field $GF(2^8)$. The diffusion layer of Rijndael is the best of these three algorithms from the viewpoint of diffusion effect. From the viewpoint of computational efficiency, however, the order is E2, CRYPTON, and Rijndael. It is therefore instructive that security is hard to reconcile with computational efficiency even when we consider only the diffusion layer. A brief comparison of the diffusion layers used in the three block ciphers is given in Table 1.



Table 1. Comparison of diffusion layers

Cipher                              | E2                 | CRYPTON                              | Rijndael
Diffusion layer                     | P : 8 x 8 matrix   | $\pi_0 : GF(2)^{32} \to GF(2)^{32}$  | $\theta : GF(2^8)^4 \to GF(2^8)^4$
                                    | (LT over GF(2))    | (LT over GF(2))                      | (LT over GF(2^8))
Operations                          | EXORs              | EXORs, ANDs                          | EXORs, Mul. in GF(2^8) (Shifts, EXORs)
$\beta_d = \beta_l$ (Maximum)       | 5 (9)              | 4 (5)                                | 5 (5)
Diffusion effect                    | 56 %               | 80 %                                 | 100 %



7 Conclusion 

We examined the diffusion layers of some block ciphers referred to as substitution-permutation networks. We investigated the practical security of these diffusion layers against differential and linear cryptanalysis by using the notion of active S-boxes. It was shown in Section 3 that the minimum number of differentially active S-boxes and that of linearly active S-boxes are generally not identical, and we proposed some special conditions under which they are identical (see Theorems 3 and 4). Moreover, we applied our results to analyze the three diffusion layers used in the block ciphers E2, CRYPTON and Rijndael, respectively. It was also shown that all of these diffusion layers achieve optimal security under their respective constraints on the operations used.
Abstract. We obtain an exponential lower bound on the non-linear complexity of the new pseudo-random function, introduced recently by M. Naor and O. Reingold. This bound is an extension of the lower bound on the linear complexity of this function that has been obtained by F. Griffin and I. E. Shparlinski.



1 Introduction 

Let $p$ and $l$ be primes with $l \mid p - 1$ and let $n \ge 1$ be an integer.

Denote by $\mathbb{F}_p$ the finite field of $p$ elements, which we identify with the set $\{0, \ldots, p - 1\}$. Select an element $g \in \mathbb{F}_p^*$ of multiplicative order $l$, that is,

$$g \neq 1, \qquad g^l = 1 .$$

Then for each $n$-dimensional vector $a = (a_1, \ldots, a_n) \in (\mathbb{F}_l^*)^n$ one can define the function

$$f_a(X) = g^{a_1^{x_1} \cdots a_n^{x_n}} \in \mathbb{F}_p ,$$

where $X = x_1 \ldots x_n$ is the bit representation of an $n$-bit integer $X$, $0 \le X \le 2^n - 1$, with some extra leading zeros if necessary. Thus, given $a = (a_1, \ldots, a_n) \in (\mathbb{F}_l^*)^n$, for each $X = 0, \ldots, 2^n - 1$ this function produces a certain element of $\mathbb{F}_p$. After that it can be continued periodically.
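As a concrete illustration of this definition, the sketch below evaluates $f_a(X)$ for toy parameters. The values $p = 23$, $l = 11$, $g = 2$ and $a = (3, 5, 7)$ are chosen here purely for illustration and are far too small for any cryptographic use; the function name `naor_reingold` is likewise hypothetical.

```python
def naor_reingold(a, X, g, l, p):
    """Evaluate f_a(X) = g^(a_1^{x_1} ... a_n^{x_n}) mod p, with x_1 the leading bit of X."""
    n = len(a)
    exponent = 1
    for i in range(n):
        bit = (X >> (n - 1 - i)) & 1  # this is x_{i+1}
        if bit:
            # Exponents can be reduced modulo l because g has order l.
            exponent = (exponent * a[i]) % l
    return pow(g, exponent, p)


# Toy parameters (assumptions for illustration): l = 11 divides p - 1 = 22,
# and g = 2 has multiplicative order 11 modulo 23.
p, l, g = 23, 11, 2
a = (3, 5, 7)

values = [naor_reingold(a, X, g, l, p) for X in range(2 ** len(a))]
print(values)
```

Each evaluation costs at most $n - 1$ multiplications modulo $l$ and one modular exponentiation.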
For a randomly chosen vector $a \in (\mathbb{F}_l^*)^n$, M. Naor and O. Reingold [4] have proposed the function $f_a(X)$ as an efficient pseudo-random function (it is assumed in [4] that $n$ is the bit length of $p$, but similar results hold in much more general settings).

It is shown in [4] that the function $f_a(X)$ has some very desirable security properties, provided that certain standard cryptographic assumptions about the hardness of breaking the Diffie-Hellman cryptosystem hold. It is also shown in [4] that this function can be computed in parallel by threshold circuits of bounded depth and polynomial size.

The distribution properties of this function have been studied in [8], and it has been proved that the statistical distribution of $f_a(X)$ is exponentially close to uniform for almost all $a \in (\mathbb{F}_l^*)^n$.

For the elliptic curve version of this generator similar results have been obtained in [9].

The linear complexity, which is an important cryptographic characteristic of this sequence, has been estimated in [2].

Here we study the more general question of non-linear complexity.

Given an integer $d \ge 1$ and an $N$-element sequence $W_1, \ldots, W_N$ over a ring $\mathcal{R}$, we define the degree $d$ complexity, $L(d)$, as the smallest number $L$ such that there exists a polynomial $F(Z_1, \ldots, Z_L)$ over $\mathcal{R}$ of degree at most $d$ in $L$ variables such that

$$W_{X+L} = F(W_X, \ldots, W_{X+L-1}), \qquad X = 1, \ldots, N - L .$$

The case $d = 1$ is closely related to the notion of the linear complexity, $L$, the only distinction being that in the traditional definition of linear complexity only homogeneous linear polynomials are considered. However, this distinction is not very important since one can easily verify that $L(1) \le L \le L(1) + 1$.

Linear complexity is an essential cryptographic characteristic that has been studied in many works, see [1, 3, 5, 6, 7]. Since non-linear complexity is harder to study, much less is known about this characteristic, even though it is of ultimate interest as well, see [1, 5].

In this paper we extend the method of [2] and obtain an exponential lower bound on the degree $d$ complexity, $L_a(d)$, of the sequence $f_a(X)$, $X = 0, \ldots, 2^n - 1$, which holds for almost all $a \in (\mathbb{F}_l^*)^n$.

Throughout the paper, log z denotes the binary logarithm of z. 

2 Preparations 

We need some statements about the distribution in $\mathbb{F}_l^*$ of products of the form

$$b^z = b_1^{z_1} \cdots b_m^{z_m}, \qquad z = (z_1, \ldots, z_m) \in \{0, 1\}^m ,$$

which are of independent interest.

Denote

$$i = (1, \ldots, 1) \in \{0, 1\}^m ,$$

so that $b^i = b_1 \cdots b_m$.
Lemma 1. For all but at most

$$N_{m,d} \le \sum_{r=1}^{d} r \binom{2^m + r - 2}{r} (l - 1)^{m-1}$$

vectors $b = (b_1, \ldots, b_m) \in (\mathbb{F}_l^*)^m$,

$$b^{z_1} + \ldots + b^{z_r} \neq b^i ,$$

for any choice of $r \le d$ vectors $z_\nu \in \{0, 1\}^m$ with $z_\nu \neq i$, $\nu = 1, \ldots, r$.

Proof. For all $r = 1, \ldots, d$, let $\mathcal{Z}_r$ denote the set of all non-equivalent $r$-tuples $(z_1, \ldots, z_r)$ with $z_\nu \in \{0, 1\}^m$ and $z_\nu \neq i$, $\nu = 1, \ldots, r$, where two $r$-tuples are considered to be equivalent if one is a permutation of the other.

The cardinality $\#\mathcal{Z}_r$ of this set is equal to the number of solutions of the equation

$$\sum_{k=1}^{2^m - 1} n_k = r$$

in nonnegative integers $n_1, \ldots, n_{2^m - 1}$. Indeed, if we list the vectors

$$v_k \in \{0, 1\}^m \setminus \{i\}, \qquad k = 1, \ldots, 2^m - 1 ,$$

then every $r$-tuple in $\mathcal{Z}_r$ is uniquely defined by the number of times $n_k$ that the vector $v_k$ occurs in the $r$-tuple.

Therefore

$$\#\mathcal{Z}_r = \binom{2^m + r - 2}{r} .$$

For each $r$-tuple $(z_1, \ldots, z_r) \in \mathcal{Z}_r$ the number of solutions of the equation

$$b^{z_1} + \ldots + b^{z_r} = b^i$$

in $b \in (\mathbb{F}_l^*)^m$ does not exceed $r(l - 1)^{m-1}$. This can easily be proved for all $m \ge 1$ by induction on $r$.

It is convenient to start the induction with $r = 0$, where the statement is clearly true for all $m \ge 1$ (the equation $b^i = 0$ has no solutions).

Otherwise we select $j$ such that the vector $z_r$ has a zero $j$th component. This is always possible because $z_r \neq i$. Then the above equation can be written in the form $A = B b_j$, where $A$ and $B$ do not depend on $b_j$. Because of our choice of $j$, we see that by induction $B$ vanishes for at most $(r - 1)(l - 1)^{m-2}$ vectors $(b_1, \ldots, b_{j-1}, b_{j+1}, \ldots, b_m) \in (\mathbb{F}_l^*)^{m-1}$, and in this case we have at most $l - 1$ values for $b_j$. If $B \neq 0$ then for any vector $(b_1, \ldots, b_{j-1}, b_{j+1}, \ldots, b_m) \in (\mathbb{F}_l^*)^{m-1}$ the value of $b_j$ is defined uniquely. Therefore the number of solutions does not exceed $(r - 1)(l - 1)^{m-1} + (l - 1)^{m-1} = r(l - 1)^{m-1}$. This completes the induction. Accordingly,

$$N_{m,d} \le \sum_{r=1}^{d} r (l - 1)^{m-1} \, \#\mathcal{Z}_r$$

and the bound follows. □
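The counting step of the proof can be checked on small cases. The sketch below enumerates the equivalence classes of $r$-tuples directly as multisets and compares the count with the binomial coefficient, then evaluates the right-hand side of the bound on $N_{m,d}$. The function names are hypothetical, and the parameter values in the example are arbitrary small choices.

```python
from itertools import combinations_with_replacement, product
from math import comb


def count_classes(m, r):
    """Count r-tuples of vectors from {0,1}^m without (1,...,1), up to permutation."""
    vectors = [z for z in product((0, 1), repeat=m) if not all(z)]
    # Tuples up to permutation are exactly multisets of size r.
    return sum(1 for _ in combinations_with_replacement(vectors, r))


def lemma1_bound(m, d, l):
    """Right-hand side of the bound on N_{m,d} in Lemma 1."""
    return sum(r * comb(2 ** m + r - 2, r) * (l - 1) ** (m - 1)
               for r in range(1, d + 1))


print(count_classes(3, 3), comb(2 ** 3 + 3 - 2, 3))
print(lemma1_bound(2, 2, 5))
```

For such tiny parameters the bound exceeds the total number $(l - 1)^m$ of vectors, so it only becomes informative when $l$ is large compared with $2^{md}$, which is the regime used in the main theorem.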
We also need the following Lemma 2 of [2], which shows that for large $m$ the products $b^z$ with $z \in \{0, 1\}^m$ are quite dense in $\mathbb{F}_l^*$.

Lemma 2. Fix an arbitrary $\Delta > 0$. Then for all but at most

$$M_m \le 2^{-m} \Delta^{-1} (l - 1)^{m+2}$$

vectors $b = (b_1, \ldots, b_m) \in (\mathbb{F}_l^*)^m$, the $2^m$ products $b^z$, $z \in \{0, 1\}^m$, take at least $l - 1 - \Delta$ values from $\mathbb{F}_l^*$.


3 Lower Bound of the Degree d Complexity 

Now we are prepared to prove our main result. 

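Theorem 1. Assume that for some $\gamma > 0$

$$n \ge (1 + \gamma) \log l .$$

Then for any integer $d \ge 1$ and any $\delta > 0$ the degree $d$ complexity, $L_a(d)$, of the sequence $f_a(X)$, $X = 0, \ldots, 2^n - 1$, satisfies

$$
L_a(d) \ge
\begin{cases}
0.5\,(l - 1)^{1/d - \delta}, & \text{if } \gamma \ge 1 + 1/d; \\
0.5\,(l - 1)^{\gamma/(d+1) - \delta}, & \text{if } \gamma < 1 + 1/d;
\end{cases}
$$

for all but at most

$$N \le \left( \frac{d + 1}{d!} + o(1) \right) (l - 1)^{n - \delta}, \qquad l \to \infty ,$$

vectors $a \in (\mathbb{F}_l^*)^n$.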

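Proof. If $\delta \ge \max\{1, \gamma\}$ then the bound is trivial. Otherwise we put

$$t = \min\left\{ \left\lfloor \left( \frac{1}{d} - \delta \right) \log(l - 1) \right\rfloor , \left\lfloor \left( \frac{\gamma}{d + 1} - \delta \right) \log(l - 1) \right\rfloor \right\}, \qquad s = n - t$$

and

$$\Delta = \left\lceil \frac{l - 1}{\binom{2^t + d - 1}{d} + 1} \right\rceil - 1 .$$

Therefore

$$2^{-s} = 2^{t - n} \le 2^t l^{-1 - \gamma} . \qquad (1)$$

From the inequality $2^{td} \le (l - 1)^{1 - \delta}$ we see that

$$\sum_{r=1}^{d} r \binom{2^t + r - 2}{r} \le \left( \frac{1}{(d - 1)!} + o(1) \right) (l - 1)^{1 - \delta} . \qquad (2)$$

We also have

$$2^{t(d+1)} \le (l - 1)^{\gamma - \delta} \quad \text{and} \quad \Delta \ge (d! + o(1)) (l - 1) 2^{-td} . \qquad (3)$$

From Lemmas 1 and 2 and the bounds (1), (2) and (3) we derive

$$N_{t,d} \le \sum_{r=1}^{d} r \binom{2^t + r - 2}{r} (l - 1)^{t-1} \le \left( \frac{1}{(d - 1)!} + o(1) \right) (l - 1)^{t - \delta}$$

and

$$
\begin{aligned}
M_s \le 2^{-s} \Delta^{-1} (l - 1)^{s+2} &\le \left( \frac{1}{d!} + o(1) \right) 2^{t(d+1)} (l - 1)^{s - \gamma} \\
&\le \left( \frac{1}{d!} + o(1) \right) (l - 1)^{s - \delta} .
\end{aligned}
$$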

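Let $\mathcal{A}$ be the set of vectors $a \in (\mathbb{F}_l^*)^n$ such that simultaneously

$$\#\{ a_1^{y_1} \cdots a_s^{y_s} \mid (y_1, \ldots, y_s) \in \{0, 1\}^s \} \ge l - 1 - \Delta$$

and

$$\sum_{\nu=1}^{d} a_{s+1}^{k_{1,\nu}} \cdots a_n^{k_{t,\nu}} \neq a_{s+1} \cdots a_n$$

for any $(k_{1,\nu}, \ldots, k_{t,\nu}) \in \{0, 1\}^t$ with $(k_{1,\nu}, \ldots, k_{t,\nu}) \neq (1, \ldots, 1)$.

Then, from the above inequalities, we derive

$$
\begin{aligned}
\#\mathcal{A} &\ge (l - 1)^n - N_{t,d} (l - 1)^{n - t} - M_s (l - 1)^{n - s} \\
&\ge (l - 1)^n - \left( \frac{1}{(d - 1)!} + o(1) \right) (l - 1)^{n - \delta} - \left( \frac{1}{d!} + o(1) \right) (l - 1)^{n - \delta} \\
&= (l - 1)^n - \left( \frac{d + 1}{d!} + o(1) \right) (l - 1)^{n - \delta} .
\end{aligned}
$$

We show that the lower bound of the theorem holds for any $a \in \mathcal{A}$; thus from $N \le (l - 1)^n - \#\mathcal{A}$ and the above inequality we obtain the desired upper bound on $N$.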

Let us fix $a \in \mathcal{A}$. Assume that $L_a(d) \le 2^t - 1$. Then there exists a polynomial

$$F(Z_1, \ldots, Z_{2^t - 1}) \in \mathbb{F}_p[Z_1, \ldots, Z_{2^t - 1}]$$

such that

$$F\left( f_a(X), \ldots, f_a(X + 2^t - 2) \right) = f_a(X + 2^t - 1)$$

for all $X = 0, \ldots, 2^n - 2^t$.

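Now suppose $X = 2^t Y$, where $Y = y_1 \ldots y_s$ is an $s$-bit integer, and let $K = k_1 \ldots k_t$ be a $t$-bit integer. We remark that the bits of $K$ form the rightmost bits of the sum $X + K$. Then we have

$$f_a(2^t Y + K) = g^{b_K a_1^{y_1} \cdots a_s^{y_s}}, \qquad Y = 0, \ldots, 2^s - 1 ,$$

where

$$b_K = a_{s+1}^{k_1} \cdots a_n^{k_t}, \qquad K = 0, \ldots, 2^t - 1 ,$$

and $K = k_1 \ldots k_t$ is the bit expansion of $K$.

Denote by $\Phi_a(u)$ the following exponential polynomial

$$\Phi_a(u) = F\left( g_0^u, \ldots, g_{2^t - 2}^u \right) - g_{2^t - 1}^u, \qquad u \in \mathbb{F}_l ,$$

where

$$g_K = g^{b_K}, \qquad K = 0, \ldots, 2^t - 1 .$$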

Collecting together terms with equal values of exponents and taking into account that, because of the choice of the set $\mathcal{A}$, the value of

$$b_{2^t - 1} = a_{s+1} \cdots a_n$$

is unique, we obtain that $\Phi_a(u)$ can be expressed in the form

$$\Phi_a(u) = \sum_{\nu=1}^{R} C_\nu h_\nu^u ,$$

where

$$1 \le R \le \binom{2^t + d - 1}{d} + 1 ,$$

with some coefficients $C_\nu \in \mathbb{F}_p^*$ and pairwise distinct $h_\nu \in \mathbb{F}_p^*$, $\nu = 1, \ldots, R$.

Recalling that $a \in \mathcal{A}$, we conclude that $\Phi_a(u) \neq 0$ for at most $\Delta$ values of $u = 1, \ldots, l - 1$. On the other hand, from the properties of Vandermonde determinants, it is easy to see that for any $u = 1, \ldots, l - 1$, $\Phi_a(u + v) \neq 0$ for at least one $v = 0, \ldots, R - 1$. Therefore, $\Phi_a(u) \neq 0$ for at least

$$(l - 1)/R \ge \frac{l - 1}{\binom{2^t + d - 1}{d} + 1} > \Delta$$

values of $u = 1, \ldots, l - 1$, which is not possible because of the choice of $\Delta$. The obtained contradiction implies that $L_a(d) \ge 2^t$. □



4 Remarks 

It is useful to recall that typically the bit lengths of $p$ and $l$ are of the same order as $n$. Thus

$$\log p \asymp \log l \asymp n .$$

In the most interesting case $n$ is the bit length of $p$, that is, $n \sim \log p$. In this case Theorem 1 implies a lower bound on $L_a(d)$ which is exponential in $n$, if $l \le p^{1 - \varepsilon}$ for some $\varepsilon > 0$. On the other hand, it would be interesting to estimate the linear and higher degree complexity for all values of $l \le p$.

It is also an interesting open question to study the linear complexity or higher degree complexity of single bits of $f_a(X)$. For example, one can form the sequence $\beta_a(X)$ of the rightmost bits of $f_a(X)$, $X = 0, \ldots, 2^n - 1$, and study its linear and higher degree complexity (as elements of $\mathbb{F}_2$). Unfortunately we do not see any approaches to this question.
Abstract. We introduce the concept of complementary plateaued functions and examine relationships between these newly defined functions and bent functions. Results obtained in this paper contribute to the further understanding of profound secrets of bent functions. Cryptographic applications of these results are demonstrated by constructing highly nonlinear correlation immune functions that possess no non-zero linear structures.

Keywords: Plateaued Functions, Complementary Plateaued Functions, 
Bent Functions, Cryptography 



1 Introduction 

Bent functions achieve the maximum nonlinearity and satisfy the propagation 
criterion with respect to every non-zero vector. These functions, however, are 
neither balanced nor correlation immune. Furthermore they exist only when the 
number of variables is even. All these properties impede the direct applications of 
bent functions in cryptography. They also indicate the importance of further un- 
derstanding the characteristics of bent functions in the construction of Boolean 
functions with cryptographically desirable properties. This work significantly extends a recent paper by Zheng and Zhang [12], where a new class of functions called plateaued functions was introduced. In particular, (i) we introduce the concept of complementary plateaued functions; (ii) we establish relationships between bent and complementary plateaued functions; (iii) we show that complementary plateaued functions provide a new avenue to construct bent functions; (iv) we prove a new characteristic property of non-quadratic bent functions by the use of complementary plateaued functions; (v) as an application, we construct balanced, highly nonlinear correlation immune functions that have no non-zero linear structures.
2 Boolean Functions

Definition 1. We consider functions from $V_n$ to $GF(2)$ (or simply functions on $V_n$), where $V_n$ is the vector space of $n$-tuples of elements from $GF(2)$. Usually we write a function $f$ on $V_n$ as $f(x)$, where $x = (x_1, \ldots, x_n)$ is the variable vector in $V_n$. The truth table of a function $f$ on $V_n$ is a (0,1)-sequence defined by $(f(\alpha_0), f(\alpha_1), \ldots, f(\alpha_{2^n - 1}))$, and the sequence of $f$ is a (1,-1)-sequence defined by $((-1)^{f(\alpha_0)}, (-1)^{f(\alpha_1)}, \ldots, (-1)^{f(\alpha_{2^n - 1})})$, where $\alpha_0 = (0, \ldots, 0, 0)$, $\alpha_1 = (0, \ldots, 0, 1)$, ..., $\alpha_{2^n - 1} = (1, \ldots, 1, 1)$. The matrix of $f$ is a (1,-1)-matrix of order $2^n$ defined by $M = ((-1)^{f(\alpha_i \oplus \alpha_j)})$, where $\oplus$ denotes the addition in $GF(2)$. $f$ is said to be balanced if its truth table contains an equal number of ones and zeros.



Given two sequences $a = (a_1, \ldots, a_m)$ and $b = (b_1, \ldots, b_m)$, their component-wise product is defined by $a * b = (a_1 b_1, \ldots, a_m b_m)$. In particular, if $m = 2^n$ and $a$, $b$ are the sequences of functions $f$ and $g$ on $V_n$ respectively, then $a * b$ is the sequence of $f \oplus g$, where $\oplus$ denotes the addition in $GF(2)$.

Let $a = (a_1, \ldots, a_m)$ and $b = (b_1, \ldots, b_m)$ be two sequences or vectors. The scalar product of $a$ and $b$, denoted by $\langle a, b \rangle$, is defined as the sum of the component-wise multiplications. In particular, when $a$ and $b$ are from $V_m$, $\langle a, b \rangle = a_1 b_1 \oplus \cdots \oplus a_m b_m$, where the addition and multiplication are over $GF(2)$, and when $a$ and $b$ are (1,-1)-sequences, $\langle a, b \rangle = \sum_{i=1}^{m} a_i b_i$, where the addition and multiplication are over the reals.

An affine function $f$ on $V_n$ is a function that takes the form $f(x_1, \ldots, x_n) = a_1 x_1 \oplus \cdots \oplus a_n x_n \oplus c$, where $a_j, c \in GF(2)$, $j = 1, 2, \ldots, n$. Furthermore $f$ is called a linear function if $c = 0$.

A (1,-1)-matrix $A$ of order $m$ is called a Hadamard matrix if $A A^T = m I_m$, where $A^T$ is the transpose of $A$ and $I_m$ is the identity matrix of order $m$. A Sylvester-Hadamard matrix of order $2^n$, denoted by $H_n$, is generated by the following recursive relation

$$
H_0 = 1, \qquad
H_n = \begin{pmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{pmatrix}, \qquad n = 1, 2, \ldots .
$$



Let $\ell_i$, $0 \le i \le 2^n - 1$, be the $i$th row of $H_n$. It is known that $\ell_i$ is the sequence of a linear function $\varphi_i(x)$ defined by the scalar product $\varphi_i(x) = \langle \alpha_i, x \rangle$, where $\alpha_i$ is the $i$th vector in $V_n$ according to the ascending alphabetical order.
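The relation between the rows of $H_n$ and linear functions can be checked directly for small $n$. The sketch below builds $H_n$ from the recursive relation and compares each row with the (1,-1)-sequence of $\langle \alpha_i, x \rangle$. The function names are hypothetical, and vectors of $V_n$ are represented as integers whose binary expansion gives the coordinates.

```python
def sylvester_hadamard(n):
    """Return H_n as a list of rows, using H_n = [[H, H], [H, -H]] with H = H_{n-1}."""
    h = [[1]]
    for _ in range(n):
        top = [row + row for row in h]
        bottom = [row + [-v for v in row] for row in h]
        h = top + bottom
    return h


def linear_sequence(i, n):
    """(1,-1)-sequence of the linear function <alpha_i, x> for x = alpha_0, ..., alpha_{2^n - 1}."""
    # The scalar product over GF(2) is the parity of the bitwise AND.
    return [(-1) ** bin(i & x).count("1") for x in range(2 ** n)]


H3 = sylvester_hadamard(3)
print(all(H3[i] == linear_sequence(i, 3) for i in range(8)))
```

The same representation also makes the Hadamard property easy to test, since the ordinary dot product of two distinct rows must vanish.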

The Hamming weight of a (0,1)-sequence $\xi$, denoted by $HW(\xi)$, is the number of ones in the sequence. Given two functions $f$ and $g$ on $V_n$, the Hamming distance $d(f, g)$ between them is defined as the Hamming weight of the truth table of $f(x) \oplus g(x)$.

The equality in the following lemma is called Parseval's equation (Page 416 [4]).

Lemma 1. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Then

$$\sum_{i=0}^{2^n - 1} \langle \xi, \ell_i \rangle^2 = 2^{2n} ,$$

where $\ell_i$ is the $i$th row of $H_n$, $i = 0, 1, \ldots, 2^n - 1$.

Definition 2. The nonlinearity of a function $f$ on $V_n$, denoted by $N_f$, is the minimal Hamming distance between $f$ and all affine functions on $V_n$, i.e., $N_f = \min_{i = 1, 2, \ldots, 2^{n+1}} d(f, \varphi_i)$, where $\varphi_1, \varphi_2, \ldots, \varphi_{2^{n+1}}$ are all the affine functions on $V_n$.

The following characterization of nonlinearity will be useful (for a proof see for instance [5]).

Lemma 2. The nonlinearity of $f$ on $V_n$ can be expressed by

$$N_f = 2^{n-1} - \frac{1}{2} \max\{ |\langle \xi, \ell_i \rangle| , \; 0 \le i \le 2^n - 1 \} ,$$

where $\xi$ is the sequence of $f$ and $\ell_0, \ldots, \ell_{2^n - 1}$ are the rows of $H_n$, namely, the sequences of linear functions on $V_n$.

The nonlinearity of functions on $V_n$ is upper bounded by $2^{n-1} - 2^{\frac{1}{2}n - 1}$.
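Lemma 2 and Parseval's equation can be illustrated on a small example. The sketch below computes the scalar products $\langle \xi, \ell_i \rangle$ by direct summation and derives the nonlinearity from them. The example function (the majority of three bits) and the helper names are choices made here for illustration only.

```python
def sequence(truth_table):
    """(1,-1)-sequence of a function given by its (0,1) truth table."""
    return [(-1) ** bit for bit in truth_table]


def spectrum(truth_table):
    """Scalar products <xi, l_i> for every row l_i of H_n, by direct summation."""
    size = len(truth_table)
    xi = sequence(truth_table)
    return [sum(xi[x] * (-1) ** bin(i & x).count("1") for x in range(size))
            for i in range(size)]


def nonlinearity(truth_table):
    """N_f = 2^(n-1) - (1/2) max |<xi, l_i>|, as in Lemma 2."""
    size = len(truth_table)
    return size // 2 - max(abs(w) for w in spectrum(truth_table)) // 2


# Majority of three bits, listed for alpha_0, ..., alpha_7.
maj3 = [0, 0, 0, 1, 0, 1, 1, 1]
print(spectrum(maj3), nonlinearity(maj3))
```

For this function the squared scalar products sum to $2^{2n} = 64$, in agreement with Parseval's equation.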

Definition 3. Let $f$ be a function on $V_n$. For a vector $\alpha \in V_n$, denote by $\xi(\alpha)$ the sequence of $f(x \oplus \alpha)$. Thus $\xi(0)$ is the sequence of $f$ itself and $\xi(0) * \xi(\alpha)$ is the sequence of $f(x) \oplus f(x \oplus \alpha)$. Set

$$\Delta_f(\alpha) = \langle \xi(0), \xi(\alpha) \rangle ,$$

the scalar product of $\xi(0)$ and $\xi(\alpha)$. $\Delta_f(\alpha)$ is also called the auto-correlation of $f$ with a shift $\alpha$.

We can simply write $\Delta_f(\alpha)$ as $\Delta(\alpha)$ if no confusion takes place.

Definition 4. Let $f$ be a function on $V_n$. We say that $f$ satisfies the propagation criterion with respect to $\alpha$ if $f(x) \oplus f(x \oplus \alpha)$ is a balanced function, where $x = (x_1, \ldots, x_n)$ and $\alpha$ is a vector in $V_n$. Furthermore $f$ is said to satisfy the propagation criterion of degree $k$ if it satisfies the propagation criterion with respect to every non-zero vector $\alpha$ whose Hamming weight is not larger than $k$ (see [6]).

The strict avalanche criterion (SAC) [9] is the same as the propagation criterion of degree one.

Obviously, $\Delta(\alpha) = 0$ if and only if $f(x) \oplus f(x \oplus \alpha)$ is balanced, i.e., $f$ satisfies the propagation criterion with respect to $\alpha$.

Definition 5. Let $f$ be a function on $V_n$. A vector $\alpha$ in $V_n$ is called a linear structure of $f$ if $|\Delta(\alpha)| = 2^n$ (i.e., $f(x) \oplus f(x \oplus \alpha)$ is a constant).

For any function $f$, $\Delta(\alpha_0) = 2^n$, where $\alpha_0$ is the zero vector in $V_n$. It is easy to verify that the set of all linear structures of a function $f$ forms a linear subspace of $V_n$, whose dimension is called the linearity of $f$. It is also well known that if $f$ has a non-zero linear structure, then there exists a nonsingular $n \times n$ matrix $B$ over $GF(2)$ such that $f(xB) = g(y) \oplus h(z)$, where $x = (y, z)$, $y \in V_p$, $z \in V_q$, $g$ is a function on $V_p$ that has no non-zero linear structure, and $h$ is a linear function on $V_q$. Hence $q$ is equal to the linearity of $f$.

The following lemma is a restatement of a relation proved in Section 2 of [2].

Lemma 3. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Then

$$(\Delta(\alpha_0), \Delta(\alpha_1), \ldots, \Delta(\alpha_{2^n - 1})) H_n = (\langle \xi, \ell_0 \rangle^2, \langle \xi, \ell_1 \rangle^2, \ldots, \langle \xi, \ell_{2^n - 1} \rangle^2) ,$$

where $\alpha_j$ is the binary representation of an integer $j$, $j = 0, 1, \ldots, 2^n - 1$, and $\ell_i$ is the $i$th row of $H_n$.

There exist a number of equivalent definitions of correlation immune functions [1, 3]. It is easy to verify that the following definition is equivalent to Definition 2.1 of [1]:

Definition 6. Let $f$ be a function on $V_n$ and let $\xi$ be its sequence. Then $f$ is called a $k$th-order correlation immune function if and only if $\langle \xi, \ell \rangle = 0$ for every $\ell$, the sequence of a linear function $\varphi(x) = \langle \alpha, x \rangle$ on $V_n$ constrained by $1 \le HW(\alpha) \le k$.

For convenience, we give the following statement.

Lemma 4. Let $f$ be a function on $V_n$ and let $\xi$ be its sequence. Then $\langle \xi, \ell_i \rangle = 0$, where $\ell_i$ is the $i$th row of $H_n$, if and only if $f(x) \oplus \langle \alpha_i, x \rangle$ is balanced, where $\alpha_i$ is the binary representation of the integer $i$, $i = 0, 1, \ldots, 2^n - 1$.

In fact, $\ell_i$ is the sequence of the linear function $\varphi_i(x) = \langle \alpha_i, x \rangle$. This proves Lemma 4. Due to Lemma 4 and Definition 6, we conclude

Lemma 5. Let $f$ be a function on $V_n$ and let $\xi$ be its sequence. Then $f$ is a $k$th-order correlation immune function if and only if $f(x) \oplus \langle \alpha, x \rangle$ is balanced, where $\alpha$ is any vector in $V_n$ constrained by $1 \le HW(\alpha) \le k$.

Definition 7. A function $f$ on $V_n$ is called a bent function [7] if $\langle \xi, \ell_i \rangle^2 = 2^n$ for every $i = 0, 1, \ldots, 2^n - 1$, where $\xi$ is the sequence of $f$ and $\ell_i$ is the $i$th row of $H_n$.

A bent function on $V_n$ exists only when $n$ is even, and it achieves the maximum nonlinearity $2^{n-1} - 2^{\frac{1}{2}n - 1}$. From [7] we have the following:

Theorem 1. Let $f$ be a function on $V_n$. The following statements are equivalent: (i) $f$ is bent, (ii) the nonlinearity of $f$, $N_f$, satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}n - 1}$, (iii) $\Delta(\alpha) = 0$ for any non-zero $\alpha$ in $V_n$, (iv) the matrix of $f$ is an Hadamard matrix.
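The equivalences of Theorem 1 can be observed on the quadratic function $x_1 x_2 \oplus x_3 x_4$ on $V_4$, a standard example of a bent function chosen here for illustration. The sketch below computes its scalar products with the rows of $H_4$, its nonlinearity and its auto-correlation; all helper names are hypothetical.

```python
def truth_table(f, n):
    """Truth table (f(alpha_0), ..., f(alpha_{2^n - 1})), with x_1 as the leading bit."""
    return [f(*[(j >> (n - 1 - k)) & 1 for k in range(n)]) for j in range(2 ** n)]


def spectrum(tt):
    """Scalar products <xi, l_i> of the sequence of f with the rows of H_n."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(i & x).count("1") & 1)) for x in range(size))
            for i in range(size)]


def autocorrelation(tt):
    """Delta(alpha) for every shift alpha, with alpha encoded as an integer."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ tt[x ^ a]) for x in range(size)) for a in range(size)]


n = 4
tt = truth_table(lambda x1, x2, x3, x4: (x1 & x2) ^ (x3 & x4), n)
spec = spectrum(tt)
delta = autocorrelation(tt)

is_bent = all(w * w == 2 ** n for w in spec)
nonlinearity = 2 ** (n - 1) - max(abs(w) for w in spec) // 2
print(is_bent, nonlinearity, delta)
```

Statements (i), (ii) and (iii) of the theorem show up together here: every squared scalar product equals 16, the nonlinearity is 6, and the auto-correlation vanishes at every non-zero shift.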

Bent functions have the following properties [7]:

Proposition 1. Let $f$ be a bent function on $V_n$ and $\xi$ denote the sequence of $f$. Then (i) the degree of $f$ is at most $\frac{1}{2}n$, (ii) for any nonsingular $n \times n$ matrix $B$ over $GF(2)$ and any vector $\beta \in V_n$, $g(x) = f(xB \oplus \beta)$ is a bent function, (iii) for any affine function $\psi$ on $V_n$, $f \oplus \psi$ is a bent function, (iv) $2^{-\frac{1}{2}n} \xi H_n$ is the sequence of a bent function.
The following is from [11] (called Theorem 18 in that paper).

Lemma 6. Let $f$ be a function on $V_n$ ($n \ge 2$), $\xi$ be the sequence of $f$, and $p$ be an integer, $2 \le p \le n$. If $\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{n - p + 2}}$, where $\ell_j$ is the $j$th row of $H_n$, $j = 0, 1, \ldots, 2^n - 1$, then the degree of $f$ is at most $p - 1$.

3 Plateaued Functions 

3.1 rth-order Plateaued Functions 

The concept of plateaued functions was first introduced in [12]. In addition to the 
concept, the same paper also studies the existence, properties and construction 
methods of plateaued functions. 

Notation 1. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Set $\Im_f = \{ i \mid \langle \xi, \ell_i \rangle \neq 0, \; 0 \le i \le 2^n - 1 \}$, where $\ell_i$ is the $i$th row of $H_n$, $i = 0, 1, \ldots, 2^n - 1$.

We will simply write $\Im_f$ as $\Im$ when no confusion arises.

Definition 8. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. If there exists an even number $r$, $0 \le r \le n$, such that $\#\Im = 2^r$ and each $\langle \xi, \ell_j \rangle^2$ takes the value of $2^{2n - r}$ or 0 only, where $\ell_j$ denotes the $j$th row of $H_n$, $j = 0, 1, \ldots, 2^n - 1$, then $f$ is called an $r$th-order plateaued function on $V_n$. $f$ is also called a plateaued function on $V_n$ if we ignore the particular order $r$.

Due to Parseval's equation, the condition $\#\Im = 2^r$ can be obtained from the condition "each $\langle \xi, \ell_j \rangle^2$ takes the value of $2^{2n - r}$ or 0 only, where $\ell_j$ denotes the $j$th row of $H_n$, $j = 0, 1, \ldots, 2^n - 1$". For convenience, however, both conditions are mentioned in Definition 8.

The following can be immediately obtained from Definition 8. 

Proposition 2. Let $f$ be a function on $V_n$. We conclude (i) if $f$ is an $r$th-order plateaued function then $r$ must be even, (ii) $f$ is an $n$th-order plateaued function if and only if $f$ is bent, (iii) $f$ is a 0th-order plateaued function if and only if $f$ is affine.

The next result is a consequence of Theorem 3 of [8].

Proposition 3. A partially-bent function is a plateaued function.

However, it is important to note that the converse of Proposition 3 has been 
shown to be false [12]. 

3.2 $(n - 1)$th-order Plateaued Functions on $V_n$

Following the general results on $r$th-order plateaued functions on $V_n$ [12], in this paper we examine in greater depth the properties and construction methods of $(n - 1)$th-order plateaued functions on $V_n$. These properties will be useful in research into bent functions.
Proposition 4. Let $p$ be a positive odd number and $g$ be a $(p - 1)$th-order plateaued function on $V_p$. Then

(i) the nonlinearity of $g$, $N_g$, satisfies $N_g = 2^{p-1} - 2^{\frac{1}{2}(p-1)}$,

(ii) the degree of $g$ is at most $\frac{1}{2}(p + 1)$,

(iii) $g$ has at most one non-zero linear structure,

(iv) for any nonsingular $p \times p$ matrix $B$ over $GF(2)$ and any vector $\beta \in V_p$, $h(y) = g(yB \oplus \beta)$ is also a $(p - 1)$th-order plateaued function, where $y \in V_p$,

(v) for any affine function $\psi$ on $V_p$, $g \oplus \psi$ is also a $(p - 1)$th-order plateaued function on $V_p$.



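Proof. Due to Lemmas 2 and 6, (i) and (ii) are obvious. We now prove (iii). Applying Lemma 3 to the function $g$, we have

$$(\Delta(\beta_0), \Delta(\beta_1), \ldots, \Delta(\beta_{2^p - 1})) H_p = (\langle \xi, e_0 \rangle^2, \langle \xi, e_1 \rangle^2, \ldots, \langle \xi, e_{2^p - 1} \rangle^2) ,$$

where $\xi$ is the sequence of $g$, $\beta_j$ is the binary representation of an integer $j$, $j = 0, 1, \ldots, 2^p - 1$, and $e_i$ is the $i$th row of $H_p$. Multiplying the above equality by itself, we obtain $2^p \sum_{j=0}^{2^p - 1} \Delta^2(\beta_j) = \sum_{j=0}^{2^p - 1} \langle \xi, e_j \rangle^4$. Note that $\Delta(\beta_0) = 2^p$ and that $g$ is a $(p - 1)$th-order plateaued function on $V_p$. Hence $2^p \left( 2^{2p} + \sum_{j=1}^{2^p - 1} \Delta^2(\beta_j) \right) = 2^{3p + 1}$. It follows that $\sum_{j=1}^{2^p - 1} \Delta^2(\beta_j) = 2^{2p}$. This proves that $g$ has at most one non-zero linear structure and hence (iii) is true. (iv) and (v) are easy to verify. □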
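Proposition 4 can be illustrated on a small case. The sketch below takes $g(x_1, x_2, x_3) = x_1 x_2 \oplus x_3$ on $V_3$, a function chosen here as an example, and checks that it is a 2nd-order plateaued function with the stated nonlinearity and exactly one non-zero linear structure. The helper names are hypothetical.

```python
def spectrum(tt):
    """Scalar products of the sequence of g with the rows of H_p."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(i & x).count("1") & 1)) for x in range(size))
            for i in range(size)]


def autocorrelation(tt):
    """Delta(beta) for every shift beta, with beta encoded as an integer."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ tt[x ^ b]) for x in range(size)) for b in range(size)]


p = 3
# g(x1, x2, x3) = x1 x2 + x3, with x1 as the leading bit of the index.
tt = [(((x >> 2) & 1) & ((x >> 1) & 1)) ^ (x & 1) for x in range(2 ** p)]

spec = spectrum(tt)
delta = autocorrelation(tt)

support = [i for i, w in enumerate(spec) if w != 0]
nonlinearity = 2 ** (p - 1) - max(abs(w) for w in spec) // 2
linear_structures = [b for b in range(1, 2 ** p) if abs(delta[b]) == 2 ** p]
print(spec, nonlinearity, linear_structures)
```

The squared auto-correlation values at non-zero shifts sum to $2^{2p} = 64$ here, which is the identity used in the proof of (iii).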



Theorem 2. Let $p$ be a positive odd number and $g$ be a $(p - 1)$th-order plateaued function on $V_p$ that has no non-zero linear structure. Then there exists a nonsingular $p \times p$ matrix $B$ over $GF(2)$ such that $h(y) = g(yB)$, where $y \in V_p$, is a $(p - 1)$th-order plateaued function on $V_p$ and also a 1st-order correlation immune function.



Proof. Set $\Omega = \{\beta \mid \beta \in V_p, \langle \xi, e_\beta \rangle = 0\}$, where $e_\beta$ is identified with $e_i$ and $\beta$ is the binary representation of an integer $i$, $0 \le i \le 2^p - 1$.

Since $\#\Omega = 2^{p-1}$, the rank of $\Omega$, denoted $rank(\Omega)$, satisfies $rank(\Omega) \ge p-1$. We now prove $rank(\Omega) = p$. Assume that $rank(\Omega) = p-1$. Since $\#\Omega = 2^{p-1}$, $\Omega$ is identified with a $(p-1)$-dimensional linear subspace of $V_p$. Recall that we can use a nonsingular affine transformation on the variables to transform a linear subspace into any other linear subspace with the same dimension. Without loss of generality, we assume that $\Omega$ is composed of $\beta_0, \beta_1, \ldots, \beta_{2^{p-1}-1}$, where each $\beta_j$ is the binary representation of an integer $j$, $0 \le j \le 2^{p-1} - 1$. By using Lemma 3, we have

$(\langle \xi, e_0 \rangle^2, \langle \xi, e_1 \rangle^2, \ldots, \langle \xi, e_{2^p-1} \rangle^2) H_p = 2^p (\Delta_g(\beta_0), \Delta_g(\beta_1), \ldots, \Delta_g(\beta_{2^p-1}))$

and hence

$(0, 0, \ldots, 0, 2^{p+1}, 2^{p+1}, \ldots, 2^{p+1}) H_p = 2^p (\Delta_g(\beta_0), \Delta_g(\beta_1), \ldots, \Delta_g(\beta_{2^p-1}))$

where the number of zeros is equal to $2^{p-1}$. By using the construction of $H_p$ and comparing the terms in the above equality, we find that $\Delta_g(\beta_{2^{p-1}}) = -2^p$. That is, $\beta_{2^{p-1}}$ is a non-zero linear structure of $g$. This contradicts the assumption in the theorem that $g$ has no non-zero linear structure. This proves $rank(\Omega) = p$. Hence we can choose $p$ linearly independent vectors $\gamma_1, \ldots, \gamma_p$ from $\Omega$.

Let $\mu_j$ denote the vector in $V_p$ whose $j$th term is one and all other terms are zeros, $j = 1, \ldots, p$. Define a $p \times p$ matrix $B$ over $GF(2)$ such that $\gamma_j B = \mu_j$, $j = 1, \ldots, p$. Set $h(y) = g(yB^T)$, where $y \in V_p$ and $B^T$ is the transpose of $B$. Due to (iv) of Proposition 4, $h(y)$ is a $(p-1)$th-order plateaued function on $V_p$. Next we prove that $h(y)$ is a 1st-order correlation immune function.

Note that $h(y) \oplus \langle \mu_j, y \rangle = g(yB^T) \oplus \langle \mu_j, y \rangle = g(z) \oplus \langle \mu_j, z(B^T)^{-1} \rangle$ where $z = yB^T$.

On the other hand,

$\langle \mu_j, z(B^T)^{-1} \rangle = z(B^T)^{-1} \mu_j^T = z(B^{-1})^T \mu_j^T = z(\mu_j B^{-1})^T = z \gamma_j^T = \langle z, \gamma_j \rangle$

It follows that $h(y) \oplus \langle \mu_j, y \rangle = g(z) \oplus \langle \gamma_j, z \rangle$ where $z = yB^T$.

Note that $e_{\gamma_j}$ is the sequence of the linear function $\langle \gamma_j, y \rangle$. Since $\gamma_j \in \Omega$, $\langle \xi, e_{\gamma_j} \rangle = 0$. Due to Lemma 4, $g(z) \oplus \langle \gamma_j, z \rangle$ is balanced. Hence $h(y) \oplus \langle \mu_j, y \rangle$ is balanced. By using Lemma 5, we have proved that $h(y)$ is a 1st-order correlation immune function. □

Theorem 3. Let $p$ be a positive odd integer and $g$ be a $(p-1)$th-order plateaued function on $V_p$. If $g$ has a non-zero linear structure, then there exists a nonsingular $p \times p$ matrix $B$ over $GF(2)$, such that $g(yB) = cx_1 \oplus h(z)$ where $y = (x_1, x_2, \ldots, x_p)$, $z = (x_2, \ldots, x_p)$, each $x_j \in GF(2)$ and the function $h$ is a bent function on $V_{p-1}$.

Proof. Since $g$ has a non-zero linear structure, there exists a nonsingular $p \times p$ matrix $B$ over $GF(2)$, such that $g^*(y) = g(yB) = cx_1 \oplus h(z)$ where $y = (x_1, x_2, \ldots, x_p)$, $z = (x_2, \ldots, x_p)$ and $h$ is a function on $V_{p-1}$. We only need to prove that $h$ is bent. Without loss of generality, assume that $c = 1$. Then we have $g^*(y) = x_1 \oplus h(z)$. Let $\eta$ denote the sequence of $h$. Hence the sequence of $g^*$, denoted by $\xi$, satisfies $\xi = (\eta, -\eta)$. Let $e_i$ denote the $i$th row of $H_{p-1}$. From the structure of Sylvester-Hadamard matrices, $(e_i, e_i)$ is the $i$th row of $H_p$, denoted by $\ell_i$, $i = 0, 1, \ldots, 2^{p-1} - 1$, and $(e_i, -e_i)$ is the $(2^{p-1} + i)$th row of $H_p$, denoted by $\ell_{2^{p-1}+i}$, $i = 0, 1, \ldots, 2^{p-1} - 1$. Obviously

$\langle \xi, \ell_i \rangle = 0$, $i = 0, 1, \ldots, 2^{p-1} - 1$   (1)

Since $g^*$ is a $(p-1)$th-order plateaued function on $V_p$, (1) implies

$\langle \xi, \ell_{2^{p-1}+i} \rangle = \pm 2^{\frac{1}{2}(p+1)}$, $i = 0, 1, \ldots, 2^{p-1} - 1$   (2)

Note that $\langle \xi, \ell_{2^{p-1}+i} \rangle = 2 \langle \eta, e_i \rangle$, $i = 0, 1, \ldots, 2^{p-1} - 1$. From (2), $\langle \eta, e_i \rangle = \pm 2^{\frac{1}{2}(p-1)}$, $i = 0, 1, \ldots, 2^{p-1} - 1$. This proves that $h$ is a bent function on $V_{p-1}$.

□
4 Complementary $(n-1)$th-order Plateaued Functions on $V_n$

To explore new properties of bent functions, we propose the following new concept.

Definition 9. Let $p$ be a positive odd number and $g_1$, $g_2$ be two functions on $V_p$. Denote the sequences of $g_1$ and $g_2$ by $\xi_1$ and $\xi_2$ respectively. Then $g_1$ and $g_2$ are said to be complementary $(p-1)$th-order plateaued functions on $V_p$ if they are $(p-1)$th-order plateaued functions on $V_p$, and satisfy the property that $\langle \xi_1, e_i \rangle = 0$ if and only if $\langle \xi_2, e_i \rangle \ne 0$, and $\langle \xi_1, e_i \rangle \ne 0$ if and only if $\langle \xi_2, e_i \rangle = 0$.

The following lemma can be found in [10]:

Lemma 7. Let $k \ge 2$ be a positive integer and $2^k = a^2 + b^2$ where $a \ge b \ge 0$ and both $a$ and $b$ are integers. Then $a^2 = 2^k$ and $b = 0$ when $k$ is even, and $a^2 = b^2 = 2^{k-1}$ when $k$ is odd.

Proposition 5. Let $p$ be a positive odd number and $g_1$, $g_2$ be two functions on $V_p$. Denote the sequences of $g_1$ and $g_2$ by $\xi_1$ and $\xi_2$ respectively. Then $g_1$ and $g_2$ are complementary $(p-1)$th-order plateaued functions on $V_p$ if and only if $\langle \xi_1, e_i \rangle^2 + \langle \xi_2, e_i \rangle^2 = 2^{p+1}$, where $e_i$ is the $i$th row of $H_p$, $i = 0, 1, \ldots, 2^p - 1$.

Proof. The necessity is obvious. We now prove the sufficiency. We keep using all the notations in Definition 9. Assume that $\langle \xi_1, e_i \rangle^2 + \langle \xi_2, e_i \rangle^2 = 2^{p+1}$, where $e_i$ is the $i$th row of $H_p$, $i = 0, 1, \ldots, 2^p - 1$. Since $p + 1$ is even, by using Lemma 7, we conclude $\langle \xi_1, e_i \rangle^2 = 2^{p+1}$ or 0, $i = 0, 1, \ldots, 2^p - 1$. Similarly $\langle \xi_2, e_i \rangle^2 = 2^{p+1}$ or 0, $i = 0, 1, \ldots, 2^p - 1$. It is easy to see that $g_1$ and $g_2$ are complementary $(p-1)$th-order plateaued functions on $V_p$. □

Theorem 4. Let $p$ be a positive odd number and $g_1$, $g_2$ be two functions on $V_p$. Then $g_1$ and $g_2$ are complementary $(p-1)$th-order plateaued functions on $V_p$ if and only if for every non-zero vector $\beta$ in $V_p$, $\Delta_{g_1}(\beta) = -\Delta_{g_2}(\beta)$.



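Proof. Applying Lemma 3 to the functions $g_1$ and $g_2$, we obtain

$(\Delta_{g_1}(\beta_0) + \Delta_{g_2}(\beta_0), \Delta_{g_1}(\beta_1) + \Delta_{g_2}(\beta_1), \ldots, \Delta_{g_1}(\beta_{2^p-1}) + \Delta_{g_2}(\beta_{2^p-1})) H_p$
$= (\langle \xi_1, e_0 \rangle^2 + \langle \xi_2, e_0 \rangle^2, \langle \xi_1, e_1 \rangle^2 + \langle \xi_2, e_1 \rangle^2, \ldots, \langle \xi_1, e_{2^p-1} \rangle^2 + \langle \xi_2, e_{2^p-1} \rangle^2)$   (3)

where $\beta_i$ is the binary representation of the integer $i$ and $e_i$ is the $i$th row of $H_p$, $i = 0, 1, \ldots, 2^p - 1$.

Assume that $g_1$ and $g_2$ are complementary $(p-1)$th-order plateaued functions on $V_p$. From (3), we have

$(\Delta_{g_1}(\beta_0) + \Delta_{g_2}(\beta_0), \Delta_{g_1}(\beta_1) + \Delta_{g_2}(\beta_1), \ldots, \Delta_{g_1}(\beta_{2^p-1}) + \Delta_{g_2}(\beta_{2^p-1})) H_p = (2^{p+1}, 2^{p+1}, \ldots, 2^{p+1})$   (4)

or

$(\Delta_{g_1}(\beta_0) + \Delta_{g_2}(\beta_0), \Delta_{g_1}(\beta_1) + \Delta_{g_2}(\beta_1), \ldots, \Delta_{g_1}(\beta_{2^p-1}) + \Delta_{g_2}(\beta_{2^p-1})) = 2(1, 1, \ldots, 1) H_p$

Comparing the $j$th terms in the two sides of the above equality, we have $\Delta_{g_1}(\beta) + \Delta_{g_2}(\beta) = 2^{p+1}$ for $\beta = 0$, and $\Delta_{g_1}(\beta) + \Delta_{g_2}(\beta) = 0$ for $\beta \ne 0$.

Conversely, assume that $\Delta_{g_1}(\beta) + \Delta_{g_2}(\beta) = 0$ for $\beta \ne 0$. From (3), we have

$(2^{p+1}, 0, \ldots, 0) H_p = (\langle \xi_1, e_0 \rangle^2 + \langle \xi_2, e_0 \rangle^2, \langle \xi_1, e_1 \rangle^2 + \langle \xi_2, e_1 \rangle^2, \ldots, \langle \xi_1, e_{2^p-1} \rangle^2 + \langle \xi_2, e_{2^p-1} \rangle^2)$

It follows that $\langle \xi_1, e_i \rangle^2 + \langle \xi_2, e_i \rangle^2 = 2^{p+1}$, $i = 0, 1, \ldots, 2^p - 1$. By Proposition 5, this proves that $g_1$ and $g_2$ are complementary $(p-1)$th-order plateaued functions on $V_p$. □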
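The two characterizations (Proposition 5 and Theorem 4) can be checked numerically on a small pair. The sketch below is an illustration only: the pair $g_1 = x_2 x_3$ and $g_2 = x_1 \oplus x_2 x_3$ on $V_3$ is an assumed example (the two halves of the bent function $x_1 x_2 \oplus x_3 x_4$ on $V_4$ after renaming variables), and the helper names are hypothetical.

```python
def sequence(f, n):
    """(1, -1) sequence of f on V_n; x_1 is the most significant bit of the index."""
    return [(-1) ** f([(i >> (n - 1 - k)) & 1 for k in range(n)]) for i in range(2 ** n)]


def walsh(seq):
    """Row vector seq * H_n for the Sylvester-Hadamard matrix H_n."""
    return [sum(s * (-1) ** bin(i & j).count("1") for j, s in enumerate(seq))
            for i in range(len(seq))]


def autocorrelation(seq):
    """Delta(beta_j) = sum_i seq[i] * seq[i xor j] for every index j."""
    return [sum(seq[i] * seq[i ^ j] for i in range(len(seq))) for j in range(len(seq))]


p = 3
xi1 = sequence(lambda x: x[1] & x[2], p)           # g1, assumed example
xi2 = sequence(lambda x: x[0] ^ (x[1] & x[2]), p)  # g2, assumed example
w1, w2 = walsh(xi1), walsh(xi2)
d1, d2 = autocorrelation(xi1), autocorrelation(xi2)

# Proposition 5: the squared spectra add up to 2^(p+1) at every index.
prop5_holds = all(a * a + b * b == 2 ** (p + 1) for a, b in zip(w1, w2))
# Theorem 4: the autocorrelations cancel at every non-zero vector.
thm4_holds = all(d1[j] == -d2[j] for j in range(1, 2 ** p))

print(prop5_holds, thm4_holds)
```

Both conditions hold for this pair, and exactly one of the two functions ($g_2$) is balanced, since only its spectrum vanishes at index 0.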

By using Theorem 4, we conclude 

Proposition 6. Let $p$ be a positive odd number and $g_1$, $g_2$ be complementary $(p-1)$th-order plateaued functions on $V_p$. Then

(i) $\beta$ is a non-zero linear structure of $g_1$ if and only if $\beta$ is a non-zero linear structure of $g_2$,

(ii) one and only one of $g_1$ and $g_2$ is balanced.

Proof. (i) can be obtained from Theorem 4.

(ii) We keep using the notations in Definition 9. From Proposition 5, $\langle \xi_1, e_0 \rangle^2 = 2^{p+1}$ if and only if $\langle \xi_2, e_0 \rangle^2 = 0$, and $\langle \xi_1, e_0 \rangle^2 = 0$ if and only if $\langle \xi_2, e_0 \rangle^2 = 2^{p+1}$. Note that $e_0$ is the all-one sequence, hence $\langle \xi_j, e_0 \rangle = 0$ implies $g_j$ is balanced. Hence one and only one of $g_1$ and $g_2$ is balanced. □

Proposition 7. Let $p$ be a positive odd number and $g_1$, $g_2$ be complementary $(p-1)$th-order plateaued functions on $V_p$. For any $\beta, \gamma \in V_p$, set $g_1^*(y) = g_1(y \oplus \beta)$ and $g_2^*(y) = g_2(y \oplus \gamma)$. Then $g_1^*(y)$ and $g_2^*(y)$ are complementary $(p-1)$th-order plateaued functions on $V_p$.

Proof. Since $g_1$, $g_2$ are complementary $(p-1)$th-order plateaued functions on $V_p$, from Theorem 4, for any non-zero vector $\alpha$ in $V_p$, $\Delta_{g_1}(\alpha) = -\Delta_{g_2}(\alpha)$. On the other hand, it is easy to verify $\Delta_{g_1^*}(\alpha) = \Delta_{g_1}(\alpha)$, where $\alpha$ is any vector in $V_p$. Hence for any non-zero vector $\alpha$ in $V_p$, $\Delta_{g_1^*}(\alpha) = -\Delta_{g_2}(\alpha)$. Again, by using Theorem 4, we have proved that $g_1^*$, $g_2$ are complementary $(p-1)$th-order plateaued functions on $V_p$. By the same reasoning, we can prove that $g_1^*$ and $g_2^*$ are complementary $(p-1)$th-order plateaued functions on $V_p$. □

Now fix $\beta$, i.e., fix $g_1^*$ in Proposition 7, and let $\gamma$ be arbitrary. We can see that there exists more than one function that can team up with $g_1^*$ to form complementary $(p-1)$th-order plateaued functions on $V_p$. This shows that the relationship of complementary $(p-1)$th-order plateaued functions on $V_p$ is not a one-to-one correspondence.
Theorem 5. Let $p$ be a positive odd number and $\xi_1$, $\xi_2$ be two $(1, -1)$ sequences of length $2^p$. Set $\eta_1 = 2^{-\frac{1}{2}(p+1)} (\xi_1 + \xi_2) H_p$ and $\eta_2 = 2^{-\frac{1}{2}(p+1)} (\xi_1 - \xi_2) H_p$. Then $\xi_1$ and $\xi_2$ are the sequences of complementary $(p-1)$th-order plateaued functions on $V_p$ if and only if $\eta_1$ and $\eta_2$ are the sequences of complementary $(p-1)$th-order plateaued functions on $V_p$.

Proof. Assume that $\xi_1$ and $\xi_2$ are the sequences of complementary $(p-1)$th-order plateaued functions on $V_p$ respectively. It can be verified straightforwardly that both $\eta_1$ and $\eta_2$ are $(1, -1)$ sequences. Hence both $\eta_1$ and $\eta_2$ are the sequences of functions on $V_p$.

Furthermore we have

$\eta_1 H_p = 2^{\frac{1}{2}(p+1)} (\frac{1}{2}(\xi_1 + \xi_2))$, $\eta_2 H_p = 2^{\frac{1}{2}(p+1)} (\frac{1}{2}(\xi_1 - \xi_2))$   (5)

Note that both $\frac{1}{2}(\xi_1 + \xi_2)$ and $\frac{1}{2}(\xi_1 - \xi_2)$ are $(0, 1, -1)$ sequences. From (5), $\langle \eta_1, e_i \rangle$ and $\langle \eta_2, e_i \rangle$, where $e_i$ is the $i$th row of $H_p$, $i = 0, 1, \ldots, 2^p - 1$, take the value of $\pm 2^{\frac{1}{2}(p+1)}$ or 0 only. On the other hand, it is easy to see that the $i$th term of $\frac{1}{2}(\xi_1 + \xi_2)$ is non-zero if and only if the $i$th term of $\frac{1}{2}(\xi_1 - \xi_2)$ is zero.

This proves that $\langle \eta_1, e_i \rangle \ne 0$ if and only if $\langle \eta_2, e_i \rangle = 0$, also $\langle \eta_1, e_i \rangle = 0$ if and only if $\langle \eta_2, e_i \rangle \ne 0$, $i = 0, 1, \ldots, 2^p - 1$. By using Proposition 5, $\eta_1$ and $\eta_2$ are the sequences of complementary $(p-1)$th-order plateaued functions on $V_p$.

Conversely, assume that $\eta_1$ and $\eta_2$ are the sequences of complementary $(p-1)$th-order plateaued functions on $V_p$. Note that $\xi_1 = 2^{-\frac{1}{2}(p+1)} (\eta_1 + \eta_2) H_p$ and $\xi_2 = 2^{-\frac{1}{2}(p+1)} (\eta_1 - \eta_2) H_p$. Reversing the above deduction, we have proved that $\xi_1$ and $\xi_2$ are the sequences of complementary $(p-1)$th-order plateaued functions on $V_p$.

□
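The transform of Theorem 5 is easy to try on a small example. The following sketch is illustrative only: the starting pair of sequences (those of $x_2 x_3$ and $x_1 \oplus x_2 x_3$ on $V_3$) and the helper names are assumptions, not taken from the paper. It forms $\eta_1$ and $\eta_2$ and tests the condition of Proposition 5 on both pairs.

```python
def sequence(f, n):
    """(1, -1) sequence of f on V_n; x_1 is the most significant bit of the index."""
    return [(-1) ** f([(i >> (n - 1 - k)) & 1 for k in range(n)]) for i in range(2 ** n)]


def walsh(seq):
    """Row vector seq * H_n for the Sylvester-Hadamard matrix H_n."""
    return [sum(s * (-1) ** bin(i & j).count("1") for j, s in enumerate(seq))
            for i in range(len(seq))]


def complementary(s1, s2, p):
    """Proposition 5 test: squared spectra sum to 2^(p+1) at every index."""
    return all(a * a + b * b == 2 ** (p + 1) for a, b in zip(walsh(s1), walsh(s2)))


p = 3
xi1 = sequence(lambda x: x[1] & x[2], p)           # assumed example
xi2 = sequence(lambda x: x[0] ^ (x[1] & x[2]), p)  # assumed example

scale = 2 ** ((p + 1) // 2)  # 2^((p+1)/2), an integer because p is odd
eta1 = [w // scale for w in walsh([a + b for a, b in zip(xi1, xi2)])]
eta2 = [w // scale for w in walsh([a - b for a, b in zip(xi1, xi2)])]

print(eta1, eta2, complementary(eta1, eta2, p))
```

Integer division is exact here because every entry of $(\xi_1 \pm \xi_2) H_p$ is $\pm 2^{\frac{1}{2}(p+1)}$ when the input pair is complementary.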

In Section 5, we will prove that the existence of complementary $(n-2)$th-order plateaued functions on $V_{n-1}$ is equivalent to the existence of bent functions on $V_n$.



5 Relating Bent Functions on $V_n$ to Complementary $(n-2)$th-order Plateaued Functions on $V_{n-1}$

Lemma 8. Let $n$ be a positive even number and $f$ be a function on $V_n$. Denote the sequence of $f$ by $\xi = (\xi_1, \xi_2)$, where both $\xi_1$ and $\xi_2$ are of length $2^{n-1}$. Let $\xi_1$ and $\xi_2$ be the sequences of functions $f_1$ and $f_2$ on $V_{n-1}$ respectively. Then $f$ is bent if and only if $f_1$ and $f_2$ are complementary $(n-2)$th-order plateaued functions on $V_{n-1}$.

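Proof. Obviously, $\xi H_n = (\langle \xi, \ell_0 \rangle, \langle \xi, \ell_1 \rangle, \ldots, \langle \xi, \ell_{2^n-1} \rangle)$ where $\ell_j$ is the $j$th row of $H_n$, $j = 0, 1, \ldots, 2^n - 1$. Hence

$(\xi_1, \xi_2) \begin{bmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{bmatrix} = (\langle \xi, \ell_0 \rangle, \langle \xi, \ell_1 \rangle, \ldots, \langle \xi, \ell_{2^n-1} \rangle)$   (6)

For each $j$, $0 \le j \le 2^{n-1} - 1$, comparing the $j$th terms in the two sides of equality (6), and also comparing the $(2^{n-1} + j)$th terms in the two sides of the equality, we find

$\langle \xi_1, e_j \rangle + \langle \xi_2, e_j \rangle = \langle \xi, \ell_j \rangle$, $\langle \xi_1, e_j \rangle - \langle \xi_2, e_j \rangle = \langle \xi, \ell_{2^{n-1}+j} \rangle$   (7)

where $e_j$ is the $j$th row of $H_{n-1}$, $j = 0, 1, \ldots, 2^{n-1} - 1$.

Assume that $f$ is bent. From Theorem 1, $|\langle \xi, \ell_j \rangle| = 2^{\frac{1}{2}n}$ and $|\langle \xi, \ell_{2^{n-1}+j} \rangle| = 2^{\frac{1}{2}n}$, $j = 0, 1, \ldots, 2^{n-1} - 1$.

Due to (7), $|\langle \xi_1, e_j \rangle + \langle \xi_2, e_j \rangle| = |\langle \xi_1, e_j \rangle - \langle \xi_2, e_j \rangle| = 2^{\frac{1}{2}n}$. This forces either $|\langle \xi_1, e_j \rangle| = 2^{\frac{1}{2}n}$ and $\langle \xi_2, e_j \rangle = 0$, or $\langle \xi_1, e_j \rangle = 0$ and $|\langle \xi_2, e_j \rangle| = 2^{\frac{1}{2}n}$. This proves that $f_1$ and $f_2$ are complementary $(n-2)$th-order plateaued functions on $V_{n-1}$.

Conversely, assume that $f_1$ and $f_2$ are complementary $(n-2)$th-order plateaued functions on $V_{n-1}$. From Proposition 5, for each $i$, $0 \le i \le 2^{n-1} - 1$, $\langle \xi_1, e_i \rangle$ and $\langle \xi_2, e_i \rangle$ take the value of $\pm 2^{\frac{1}{2}n}$ or 0 only. Furthermore $\langle \xi_1, e_i \rangle = 0$ implies $\langle \xi_2, e_i \rangle \ne 0$, and $\langle \xi_1, e_i \rangle \ne 0$ implies $\langle \xi_2, e_i \rangle = 0$. From (7), $\langle \xi, \ell_j \rangle = \pm 2^{\frac{1}{2}n}$ and $\langle \xi, \ell_{2^{n-1}+j} \rangle = \pm 2^{\frac{1}{2}n}$, $j = 0, 1, \ldots, 2^{n-1} - 1$. Due to Theorem 1, $f$ is bent.

□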



Lemma 8 can be briefly restated as follows: 

Theorem 6. Let $n$ be a positive even number and $f$ be a function on $V_n$. Then $f$ is bent if and only if the two functions on $V_{n-1}$, $f(0, x_2, \ldots, x_n)$ and $f(1, x_2, \ldots, x_n)$, are complementary $(n-2)$th-order plateaued functions on $V_{n-1}$.

Proof. It is easy to verify that $f(x_1, \ldots, x_n) = (1 \oplus x_1) f(0, x_2, \ldots, x_n) \oplus x_1 f(1, x_2, \ldots, x_n)$. Set $f_1(x_2, \ldots, x_n) = f(0, x_2, \ldots, x_n)$ and $f_2(x_2, \ldots, x_n) = f(1, x_2, \ldots, x_n)$. Denote the sequences of $f_1$ and $f_2$ by $\xi_1$ and $\xi_2$ respectively. Obviously, the sequence of $f$, denoted by $\xi$, satisfies $\xi = (\xi_1, \xi_2)$. By using Lemma 8, we have proved the theorem. □

Due to Theorem 6, the following proposition is obvious. 

Proposition 8. Let $n$ be a positive even number and $f$ be a function on $V_n$. Then $f$ is bent if and only if the two functions on $V_{n-1}$,

$f(x_1, \ldots, x_{j-1}, 0, x_{j+1}, \ldots, x_n)$ and

$f(x_1, \ldots, x_{j-1}, 1, x_{j+1}, \ldots, x_n)$, are complementary $(n-2)$th-order plateaued functions on $V_{n-1}$, $j = 1, \ldots, n$.

The following theorem follows from Theorem 6 and Proposition 7.

Theorem 7. Let $n$ be a positive even number and $f$ be a function on $V_n$. Write $x = (x_1, \ldots, x_n)$ and $y = (x_2, \ldots, x_n)$ where $x_j \in GF(2)$, $j = 1, \ldots, n$. Set $f_1(x_2, \ldots, x_n) = f(0, x_2, \ldots, x_n)$ and $f_2(x_2, \ldots, x_n) = f(1, x_2, \ldots, x_n)$. Then $f$ is bent if and only if $g(x) = (1 \oplus x_1) f_1(y \oplus \gamma_1) \oplus x_1 f_2(y \oplus \gamma_2)$ is bent, where $\gamma_1$ and $\gamma_2$ are any two vectors in $V_{n-1}$.
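Theorem 7 can be exercised exhaustively in a small dimension. The sketch below is an illustration under stated assumptions: the bent function $f = x_1 x_2 \oplus x_3 x_4$ on $V_4$ is an example chosen here, bentness is tested through the flat spectrum $|\langle \xi, \ell_i \rangle| = 2^{\frac{1}{2}n}$, and all helper names are hypothetical.

```python
from itertools import product


def sequence(f, n):
    """(1, -1) sequence of f on V_n; x_1 is the most significant bit of the index."""
    return [(-1) ** f(tuple((i >> (n - 1 - k)) & 1 for k in range(n))) for i in range(2 ** n)]


def walsh(seq):
    """Row vector seq * H_n for the Sylvester-Hadamard matrix H_n."""
    return [sum(s * (-1) ** bin(i & j).count("1") for j, s in enumerate(seq))
            for i in range(len(seq))]


def is_bent(f, n):
    """A function on V_n (n even) is bent when every spectrum entry has magnitude 2^(n/2)."""
    return all(abs(w) == 2 ** (n // 2) for w in walsh(sequence(f, n)))


n = 4
f = lambda x: (x[0] & x[1]) ^ (x[2] & x[3])  # assumed example of a bent function


def shifted(gamma1, gamma2):
    """g(x) = (1 + x1) f1(y + gamma1) + x1 f2(y + gamma2), with y = (x2, ..., xn)."""
    def g(x):
        gamma = gamma2 if x[0] else gamma1
        y = tuple(a ^ b for a, b in zip(x[1:], gamma))
        return f((x[0],) + y)
    return g


vectors = list(product((0, 1), repeat=n - 1))
all_bent = all(is_bent(shifted(g1, g2), n) for g1 in vectors for g2 in vectors)
print(is_bent(f, n), all_bent)
```

Each of the 64 choices of $(\gamma_1, \gamma_2)$ is tested, so the check covers the whole statement for this one $f$; it says nothing about other functions or dimensions.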

By using Theorem 5 and Lemma 8, we conclude 
Theorem 8. Let $\xi = (\xi_1, \xi_2)$ be a $(1, -1)$ sequence of length $2^n$, where both $\xi_1$ and $\xi_2$ are of length $2^{n-1}$. Then $\xi$ is the sequence of a bent function if and only if $2^{-\frac{1}{2}n} ((\xi_1 + \xi_2) H_{n-1}, (\xi_1 - \xi_2) H_{n-1})$ is the sequence of a bent function.

Theorems 6, 7 and 8 represent new characterisations of bent functions. In addition, Theorems 7 and 8 provide methods of constructing new bent functions from known bent functions.

6 Non-quadratic Bent Functions 

Definition 10. Let $f$ be a function on $V_n$ and $W$ be an $r$-dimensional linear subspace of $V_n$. From linear algebra, $V_n$ can be divided into $2^{n-r}$ disjoint cosets of $W$:

$V_n = U_0 \cup U_1 \cup \cdots \cup U_{2^{n-r}-1}$

where $U_0 = W$, $\#U_j = 2^r$, $j = 0, 1, \ldots, 2^{n-r} - 1$, and for any two vectors $\gamma$ and $\beta$ in $V_n$, $\beta$ and $\gamma$ belong to the same coset $U_j$ if and only if $\beta \oplus \gamma \in W$. The partition is unique if the order of the cosets is ignored. Each $U_j$ can be expressed as $U_j = \gamma_j \oplus W$ where $\gamma_j$ is a vector in $V_n$ and $\gamma_j \oplus W$ denotes $\{\gamma_j \oplus \alpha \mid \alpha \in W\}$; however, $\gamma_j$ is not unique. For a coset $U = \gamma \oplus W$, define a function $g$ on $W$ such that $g(\alpha) = f(\gamma \oplus \alpha)$ for every $\alpha \in W$. Then $g$ is called the restriction of $f$ to the coset $\gamma \oplus W$. $g$ can be denoted by $f_{\gamma \oplus W}$. In particular, the restriction of $f$ to the linear subspace $W$, denoted by $f_W$, is a function $h$ on $W$ such that $h(\alpha) = f(\alpha)$ for every $\alpha \in W$.
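The coset partition and the restriction of Definition 10 can be made concrete with a few lines of Python. This is a minimal sketch, assuming vectors of $V_n$ are encoded as integers (so that $\oplus$ becomes bitwise XOR); the function names and the example subspace are hypothetical.

```python
def span(basis):
    """All GF(2)-linear combinations of the basis vectors (vectors encoded as integers)."""
    vectors = {0}
    for b in basis:
        vectors |= {v ^ b for v in vectors}
    return vectors


def cosets(n, basis):
    """Partition V_n into cosets gamma xor W, returned as (representative, coset) pairs."""
    W = span(basis)
    covered, parts = set(), []
    for gamma in range(2 ** n):
        if gamma not in covered:
            coset = {gamma ^ alpha for alpha in W}
            covered |= coset
            parts.append((gamma, coset))
    return parts


def restriction(f, gamma, W):
    """The restriction of f to gamma xor W, as a table alpha -> f(gamma xor alpha) on W."""
    return {alpha: f(gamma ^ alpha) for alpha in sorted(W)}


n, basis = 3, [0b001, 0b010]      # W = {000, 001, 010, 011}, so r = 2
parts = cosets(n, basis)
f = lambda v: (v >> 2) & (v & 1)  # x1 * x3, an arbitrary example function
tables = [restriction(f, gamma, span(basis)) for gamma, _ in parts]
print(parts, tables)
```

The representative returned for each coset is simply the smallest integer in it, which reflects the remark that $\gamma_j$ is not unique.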

Proposition 9. Let $f$ be a bent function on $V_n$ and $W$ be an arbitrary $(n-1)$-dimensional linear subspace. Let $V_n$ be divided into two disjoint cosets: $V_n = W \cup U$. Then the restriction of $f$ to the linear subspace $W$, $f_W$, and the restriction of $f$ to the coset $U$, $f_U$, are complementary $(n-2)$th-order plateaued functions on $V_{n-1}$.

Proof. In fact, $W^* = \{(0, x_2, \ldots, x_n) \mid x_2, \ldots, x_n \in GF(2)\}$ forms an $(n-1)$-dimensional linear subspace and $U^* = \{(1, x_2, \ldots, x_n) \mid x_2, \ldots, x_n \in GF(2)\}$ is a coset of $W^*$. By using a nonsingular linear transformation on the variables, we can transform $W$ into $W^*$ and $U$ into $U^*$ simultaneously. By using Theorem 6, we have proved the proposition. □

Proposition 9 shows that the restriction of $f$ to any $(n-1)$-dimensional linear subspace is still cryptographically strong.

We now prove the following characteristic property of quadratic bent func- 
tions. 

Lemma 9. Let $f$ be a bent function on $V_n$. Then $f$ is quadratic if and only if, for every $(n-1)$-dimensional linear subspace $W$, the restriction of $f$ to $W$ has a non-zero linear structure.



Proof. Let $f$ be quadratic and $W$ be an arbitrary $(n-1)$-dimensional linear subspace. Since $n - 1$ is odd, the restriction of $f$ to $W$, denoted by $g$, is not bent. Hence due to (iii) of Theorem 1, there exists a non-zero vector $\beta$ in $W$, such that $g(y) \oplus g(y \oplus \beta)$ is not balanced. On the other hand, since $g$ is also quadratic, $g(y) \oplus g(y \oplus \beta)$ is affine. It is easy to see that any non-balanced affine function must be constant. This proves that $\beta$ is a non-zero linear structure of $g$.

We now prove the converse: “if for any $(n-1)$-dimensional linear subspace $W$, the restriction of $f$ to $W$ has a non-zero linear structure, then $f$ is quadratic” by induction on the dimension $n$.

Let $n = 2$. Bent functions on $V_2$ must be quadratic. For $n = 4$, from (i) of Proposition 1, bent functions on $V_4$ must be quadratic.

Assume that the converse is true for $4 \le n \le k - 2$ where $k$ is even. We now prove the converse for $n = k$.

Let $f$ be a bent function on $V_k$ such that for any $(k-1)$-dimensional linear subspace $W$ the restriction of $f$ to $W$ has a non-zero linear structure.

It is easy to see that $f$ can be expressed as $f(x) = x_1 g(y) \oplus h(y)$ where $y = (x_2, \ldots, x_k)$, and both $g$ and $h$ are functions on $V_{k-1}$. From Theorem 6, $f(0, x_2, \ldots, x_k) = h(y)$ and $f(1, x_2, \ldots, x_k) = g(y) \oplus h(y)$ are complementary $(k-2)$th-order plateaued functions on $V_{k-1}$.

Since $\{(0, x_2, \ldots, x_k) \mid x_2, \ldots, x_k \in GF(2)\}$ forms a $(k-1)$-dimensional linear subspace, due to the assumption about $f$: “the restriction of $f$ to any $(k-1)$-dimensional linear subspace has a non-zero linear structure”, $f(0, x_2, \ldots, x_k) = h(y)$ has a non-zero linear structure. Without loss of generality, we can assume that the vector $\beta$ in $V_{k-1}$, $\beta = (1, 0, \ldots, 0)$, is the non-zero linear structure of $h(y)$. It is easy to see $h(y) = c x_2 \oplus b(z)$ where $c$ is a constant in $GF(2)$, $z = (x_3, \ldots, x_k)$ and $b(z)$ is a function on $V_{k-2}$. Without loss of generality, we assume that $c = 1$. From Theorem 3, $b(z)$ is a bent function on $V_{k-2}$.

It is easy to see $\Delta_h(\beta) = -2^{k-1}$. From Theorem 4, $\beta = (1, 0, \ldots, 0)$ is also a linear structure of $g(y) \oplus h(y)$ and $\Delta_{g \oplus h}(\beta) = 2^{k-1}$. Hence $g(y) \oplus h(y)$ can be expressed as $g(y) \oplus h(y) = d x_2 \oplus p(z)$, where $z = (x_3, \ldots, x_k)$. Due to Theorem 3, $p(z)$ is a bent function on $V_{k-2}$. Since $\Delta_{g \oplus h}(\beta) = 2^{k-1}$, $d = 0$. Hence $g(y) = h(y) \oplus p(z) = x_2 \oplus b(z) \oplus p(z)$ and hence

$f(x) = x_1 (x_2 \oplus b(z) \oplus p(z)) \oplus x_2 \oplus b(z)$   (8)

Since $\{(x_1, 0, x_3, \ldots, x_k) \mid x_1, x_3, \ldots, x_k \in GF(2)\}$ forms a $(k-1)$-dimensional linear subspace, $f(x_1, 0, x_3, \ldots, x_k)$ is the restriction of $f$ to this $(k-1)$-dimensional linear subspace. Due to the assumption about $f$, $f(x_1, 0, x_3, \ldots, x_k)$ has a non-zero linear structure, denoted by $\gamma$, $\gamma \in V_{k-1}$. From (8), $f'(u) = f(x_1, 0, x_3, \ldots, x_k) = x_1 (b(z) \oplus p(z)) \oplus b(z)$, where $u \in V_{k-1}$ and $u = (x_1, x_3, x_4, \ldots, x_k)$.

There exist two cases of $\gamma$.

Case 1: $\gamma = (0, \mu)$ where $\mu \in V_{k-2}$. Since $\gamma \ne 0$, $\mu$ is non-zero. It is easy to see $f'(u) \oplus f'(u \oplus \gamma) = x_1 (b(z) \oplus b(z \oplus \mu) \oplus p(z) \oplus p(z \oplus \mu)) \oplus b(z) \oplus b(z \oplus \mu)$.

Since $f'(u) \oplus f'(u \oplus \gamma)$ is a constant, $b(z) \oplus b(z \oplus \mu) \oplus p(z) \oplus p(z \oplus \mu) = 0$ and $b(z) \oplus b(z \oplus \mu) = c'$, where $c'$ is constant. On the other hand, since $b(z)$ is bent and $\mu \ne 0$, $b(z) \oplus b(z \oplus \mu)$ is balanced and hence it is not constant. This is a contradiction. This proves that Case 1 cannot take place.
Case 2: $\gamma = (1, \nu)$ where $\nu \in V_{k-2}$ and $\nu$ is not necessarily non-zero. It is easy to see $f'(u) \oplus f'(u \oplus \gamma) = x_1 (b(z) \oplus b(z \oplus \nu) \oplus p(z) \oplus p(z \oplus \nu)) \oplus b(z) \oplus p(z \oplus \nu)$.

Since $f'(u) \oplus f'(u \oplus \gamma)$ is a constant, $b(z) \oplus b(z \oplus \nu) \oplus p(z) \oplus p(z \oplus \nu) = 0$ and $b(z) \oplus p(z \oplus \nu) = c''$, where $c''$ is constant, and hence $b(z \oplus \nu) \oplus p(z) = c''$.

From (8),

$f(x) = x_1 x_2 \oplus x_1 (b(z) \oplus b(z \oplus \nu) \oplus c'') \oplus x_2 \oplus b(z)$   (9)

We now turn to the restriction of $f$ to another $(k-1)$-dimensional linear subspace. Write $U^* = \{(x_3, \ldots, x_k) \mid x_3, \ldots, x_k \in GF(2)\}$ and $U_* = \{(x_1, x_2) \mid x_1, x_2 \in GF(2)\}$. Hence $U^*$ is a $(k-2)$-dimensional linear subspace and $U_*$ is a 2-dimensional linear subspace, and $V_k = (U_*, U^*)$, where $(X, Y) = \{(\alpha, \beta) \mid \alpha \in X, \beta \in Y\}$.

Let $\Lambda$ denote an arbitrary $(k-3)$-dimensional linear subspace in $U^*$. Hence $(U_*, \Lambda)$ is a $(k-1)$-dimensional linear subspace.

Let $f''(y)$ denote the restriction of $f$ to $(U_*, \Lambda)$, where $y \in (U_*, \Lambda)$. Hence $y$ can be expressed as $y = (x_1, x_2, v)$ with $v = (v_1, \ldots, v_{k-2}) \in \Lambda$, where $v_1, \ldots, v_{k-2} \in GF(2)$ but not arbitrary because $\Lambda$ is a proper subset of $V_{k-2}$.

From (9), $f''(y)$ can be expressed as $f''(y) = x_1 x_2 \oplus x_1 (b'(v) \oplus b''(v) \oplus c'') \oplus x_2 \oplus b'(v)$, where $b'(v)$ denotes the restriction of $b(z)$ to $\Lambda$ and $b''(v)$ denotes the restriction of $b(z \oplus \nu)$ to $\Lambda$.

From the assumption about $f$, $f''$ has a non-zero linear structure $\gamma'$, $\gamma' \in (U_*, \Lambda)$. Write $\gamma' = (a_1, a_2, \tau)$ where $\tau \in \Lambda$. Since $\gamma' = (a_1, a_2, \tau)$ is a non-zero linear structure of $f''$, it is easy to verify $a_1 = a_2 = 0$. This proves $\gamma' = (0, 0, \tau)$. Since $\gamma'$ is non-zero, $\tau \ne 0$.

Hence $f''(y) \oplus f''(y \oplus \gamma') = x_1 (b'(v) \oplus b'(v \oplus \tau) \oplus b''(v) \oplus b''(v \oplus \tau)) \oplus b'(v) \oplus b'(v \oplus \tau)$. Since $f''(y) \oplus f''(y \oplus \gamma')$ is constant, $b'(v) \oplus b'(v \oplus \tau) \oplus b''(v) \oplus b''(v \oplus \tau) = 0$ and $b'(v) \oplus b'(v \oplus \tau)$ is constant. Hence $\tau$ is a non-zero linear structure of $b'(v)$. This proves that for any $(k-3)$-dimensional linear subspace $\Lambda$, the restriction of $b(z)$ to $\Lambda$, i.e., $b'(v)$, has a non-zero linear structure. On the other hand, since $b(z)$ is a bent function on $V_{k-2}$, due to the induction assumption, $b(z)$ is quadratic. Hence $b(z) \oplus b(z \oplus \nu)$ must be affine. From (9), we have proved that $f(x) = x_1 x_2 \oplus x_1 (b(z) \oplus b(z \oplus \nu) \oplus c'') \oplus x_2 \oplus b(z)$ is quadratic when $n = k$. □

Due to the low algebraic degree, quadratic functions are not cryptographically 
desirable, although some of them are highly nonlinear. 

The following is an equivalent statement of Lemma 9 . 

Theorem 9. Let $f$ be a bent function on $V_n$. Then $f$ is non-quadratic if and only if there exists an $(n-1)$-dimensional linear subspace $W$ such that the restriction of $f$ to $W$, $f_W$, has no non-zero linear structure.

Theorem 9 is an interesting characterization of non-quadratic bent functions. 
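To illustrate Lemma 9 and Theorem 9, the following sketch searches all hyperplanes of $V_6$ for one on which the restriction has no non-zero linear structure. Everything specific here is an assumption made for the example: the two bent functions (a quadratic one, $x_1 y_1 \oplus x_2 y_2 \oplus x_3 y_3$, and the same function with the cubic term $y_1 y_2 y_3$ added), the integer encoding of vectors, and the description of each $(n-1)$-dimensional subspace as $W_c = \{v \mid \langle c, v \rangle = 0\}$ for a non-zero $c$.

```python
def bit(v, k, n=6):
    """k-th coordinate (1-indexed, most significant first) of the integer-encoded vector v."""
    return (v >> (n - k)) & 1


def quadratic_bent(v):
    # x1*y1 + x2*y2 + x3*y3 with (x1, x2, x3, y1, y2, y3) the coordinates of v
    return (bit(v, 1) & bit(v, 4)) ^ (bit(v, 2) & bit(v, 5)) ^ (bit(v, 3) & bit(v, 6))


def cubic_bent(v):
    # the same function plus the cubic term y1*y2*y3
    return quadratic_bent(v) ^ (bit(v, 4) & bit(v, 5) & bit(v, 6))


def hyperplanes_without_linear_structure(f, n):
    """Non-zero c such that f restricted to W_c has no non-zero linear structure."""
    found = []
    for c in range(1, 2 ** n):
        W = [v for v in range(2 ** n) if bin(c & v).count("1") % 2 == 0]
        has_structure = any(
            abs(sum((-1) ** (f(a) ^ f(a ^ b)) for a in W)) == len(W)
            for b in W if b != 0
        )
        if not has_structure:
            found.append(c)
    return found


print(hyperplanes_without_linear_structure(quadratic_bent, 6))
print(hyperplanes_without_linear_structure(cubic_bent, 6))
```

For the quadratic function the search comes back empty, as Lemma 9 predicts, while for the cubic one it finds at least the hyperplane $x_1 = 0$ (encoded as $c = 32$).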

7 New Constructions of Cryptographic Functions 

The relationships among a bent function on $V_n$ and complementary $(n-2)$th-order plateaued functions on $V_{n-1}$ are helpful in designing cryptographic functions from bent functions. In fact, from Theorem 6, any bent function on $V_n$ can be “split” into complementary $(n-2)$th-order plateaued functions on $V_{n-1}$.

We prefer non-quadratic bent functions as they are useful to obtain complementary plateaued functions that have no non-zero linear structures.

Let $f$ be a non-quadratic bent function on $V_n$. By using Theorem 9, we can find an $(n-1)$-dimensional subspace $W$ such that the restriction of $f$ to $W$, $f_W$, has no non-zero linear structure. For any vector $\alpha \in V_n$ with $\alpha \notin W$, we have $(\alpha \oplus W) \cap W = \emptyset$ and $V_n = W \cup (\alpha \oplus W)$. From Proposition 9, the restriction of $f$ to $\alpha \oplus W$, $f_{\alpha \oplus W}$, and $f_W$ are complementary $(n-2)$th-order plateaued functions on $V_{n-1}$. Due to (i) of Proposition 6, $f_{\alpha \oplus W}$ has no non-zero linear structure. Due to (ii) of Proposition 6, one and only one of $f_W$ and $f_{\alpha \oplus W}$ is balanced. From Proposition 4, we can see that both $f_W$ and $f_{\alpha \oplus W}$ are highly nonlinear.

Furthermore, by using Theorem 2, we can use a nonsingular linear transformation on the variables to transform the balanced $f_W$ or $f_{\alpha \oplus W}$ into another $(n-2)$th-order plateaued function $g$ on $V_{n-1}$. The resultant function is a 1st-order correlation immune function. Obviously $g$ is still balanced and highly nonlinear, and it does not have a non-zero linear structure.

We note that there is a more straightforward method to construct a balanced, highly nonlinear function on any odd dimensional linear space, by “concatenating” known bent functions. For example, let $f$ be a bent function on $V_k$; we can set $g(x_1, \ldots, x_{k+1}) = x_1 \oplus f(x_2, \ldots, x_{k+1})$. Then $g$ is a balanced, highly nonlinear function on $V_{k+1}$, where $k + 1$ is odd. Let $\eta$ and $\xi$ denote the sequences of $g$ and $f$ respectively. It is easy to see $\eta = (\xi, -\xi)$ and hence $\eta$ is a concatenation of $\xi$ and $-\xi$. We call this method concatenating bent functions. A major problem of this method is that $g$ contains a non-zero linear structure $(1, 0, \ldots, 0)$.

In contrast, the method of “splitting” a bent function we discussed earlier 
allows us to obtain functions that do not have non-zero linear structure. 

8 Conclusions 

We have identified relationships between bent functions and complementary 
plateaued functions, and discovered a new characteristic property of bent func- 
tions. Furthermore we have proved a necessary and sufficient condition of non- 
quadratic bent functions. Based on the new results on bent functions, we have 
proposed a new method for constructing balanced, highly nonlinear and corre- 
lation immune functions that have no non-zero linear structures. 
Abstract. In this paper, we show a practical solution to the problems
where one-wayness of hash functions does not guarantee cryptographic 
systems to be secure enough. We strengthen the notion of one-wayness 
of hash functions and construct strongly one-way hash functions from 
any one-way hash function or any one-way function. 



1 Introduction 

Hash functions are important primitives for many cryptographic applications. In particular, (one-way) hash functions are essential ingredients of digital signature schemes. In general, the basic schemes underlying digital signature systems are existentially forgeable, and the one-wayness of hash functions contributes to guaranteeing existential unforgeability. For example, in [9], Pointcheval and Stern showed that the (modified) ElGamal signature scheme is existentially unforgeable against an adaptively chosen message attack, if hash functions are truly random in the sense of the random oracle model. Also, in [3], Bellare and Rogaway discussed the security of the RSA signature scheme in the random oracle model. In a sense, the random oracle model is more theoretical than practical.

There are two approaches to constructing a secure cryptographic system in a 
practical model. One is constructing a secure cryptographic system under strong 
assumptions and then weakening the assumptions. The other is constructing a 
secure cryptographic system (in a weak sense of security) in a practical model 
and then strengthening the sense of security. 

In this paper, we show a technique for boosting security according to the latter approach. One-wayness of hash functions can be a property that guarantees the security of one cryptographic system while being insufficient to guarantee the security of another.

Intuitively, one-way functions $f$ are easy to compute, but hard to invert. That is, given $x$ it is easy to compute $f(x)$, but given $f(x)$ it is hard to compute $x'$ such that $f(x) = f(x')$. Some cryptographic applications require that given $f(x)$ it also be hard to compute $x'$ such that $f(x)$ is close to $f(x')$. We say that functions with this property are neighbor-free. In general, one-wayness of functions does not ensure neighbor-freeness. We note that the notion of neighbor-freeness is different from non-malleability [5] and sibling intractability [12]. If a digital signature scheme which uses one-way hash functions does not have this property, then there is a possibility that the signature scheme is existentially forgeable. In this paper, we show a way to construct one-way neighbor-free hash functions using any one-way hash functions. Combining the results in [11] and our results, one-way neighbor-free hash functions are also constructible from any one-way functions.

2 Preliminaries 

A polynomial-time computable function $f : \{0, 1\}^* \to \{0, 1\}^*$ is one-way if for any probabilistic polynomial-time algorithm $A$, any polynomial $p$, and sufficiently large $k$,

$\Pr[f(A(f(x), 1^k)) = f(x)] < 1/p(k)$,

where the probability is taken over all $x$'s of length $k$ and the internal coin tosses of $A$, with the uniform probability distribution. A polynomial-time computable function $h : \{0, 1\}^* \to \{0, 1\}^*$ is a single one-way hash function if it satisfies the following:

(a) For some function $\ell$, $h(\{0, 1\}^k) \subseteq \{0, 1\}^{\ell(k)}$ and $\ell(k) < k$ for sufficiently large $k$.

(b) For any probabilistic polynomial-time algorithm $A$, any polynomial $p$, and sufficiently large $k$,

$\Pr[A(x, 1^k) = x', x \ne x', h(x) = h(x')] < 1/p(k)$,

where the probability is taken over all $x$'s of length $k$ and the internal coin tosses of $A$, with the uniform probability distribution.

A family $H = \bigcup_{k>0} H_k$, where $H_k = \{h \mid h : \{0, 1\}^k \to \{0, 1\}^{\ell(k)}\}$, is a family of one-way hash functions if for any probabilistic polynomial-time algorithm $A$, for any polynomial $p$, and sufficiently large $k$,

$\Pr[A(h, x, 1^k) = x', x \ne x', h(x) = h(x')] < 1/p(k)$,

where the probability is taken over all $x$'s of length $k$ and all $h$'s in $H_k$ and the internal coin tosses of $A$, with the uniform probability distribution.

A neighborhood function $N$ is a polynomial-size neighborhood if there exists some polynomial $q$ such that $|N(x)| \le q(|x|)$ for all sufficiently long $x$. We also denote by $N_k$ a polynomial-size neighborhood with the input of length $k$. We may consider that $N$ is a length-wise family $\{N_k\}$ of polynomial-size neighborhoods.

We show some examples of polynomial-size neighborhood. In the following 
examples, we assume that k is the length of inputs to Nk for the sake of simplicity. 

Example 1. Let $c$ be a constant integer. Define $N_k(y) = \{y' \mid d_H(y, y') \le c\}$, where $d_H$ denotes the Hamming distance. Note that $|N_k(y)|$ is $O(k^c)$.
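The size bound in Example 1 can be seen by direct enumeration: the neighborhood contains one string for each set of at most $c$ flipped positions, so its size is $\sum_{i=0}^{c} \binom{k}{i}$. The sketch below is only an illustration; the function name and the encoding of strings as tuples of bits are assumptions.

```python
from itertools import combinations
from math import comb


def hamming_neighborhood(y, c):
    """All strings (tuples of bits) within Hamming distance at most c of y."""
    k = len(y)
    result = []
    for weight in range(c + 1):
        for positions in combinations(range(k), weight):
            flipped = list(y)
            for pos in positions:
                flipped[pos] ^= 1
            result.append(tuple(flipped))
    return result


y = (0,) * 8
neighborhood = hamming_neighborhood(y, 2)
print(len(neighborhood), sum(comb(8, i) for i in range(3)))
```

With $c$ fixed, the sum is dominated by its last term $\binom{k}{c}$, which grows like $k^c$.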
In the above example, the distance does not depend on the parameter $k$. We next show an example where the distance depends on the parameter $k$.

Example 2. For a string $y$, we denote by $suf_k(y)$ the string $w$ such that $y = vw$ and $v$ is of length $\lfloor \log k \rfloor$. Define $N_k(y) = \{y' \mid d_2(y, y') \le \log k\}$, where $d_2$ is defined as follows:

$d_2(y, y') = d_H(y, y')$ if $suf_k(y) = suf_k(y')$, and $d_2(y, y') = \infty$ otherwise.
It follows from Wallis’s Formula (or Stirling’s Formula) that 




for any $i$ such that $0 \le i \le \log k$. Therefore, $|N_k(y)|$ is bounded by a polynomial in $k$.
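A direct enumeration makes the bound in Example 2 tangible: only the first $\lfloor \log k \rfloor$ bits may differ, so the neighborhood has at most $2^{\lfloor \log k \rfloor} \le k$ elements. The sketch below assumes that $\log$ means the base-2 logarithm and encodes strings as tuples of bits; both choices, and the function names, are assumptions made for the illustration.

```python
from itertools import product
from math import floor, log2


def d2(y, z):
    """Hamming distance when the suffixes after the first floor(log2 k) bits agree, else infinity."""
    m = floor(log2(len(y)))
    if y[m:] != z[m:]:
        return float("inf")
    return sum(a != b for a, b in zip(y, z))


def neighborhood(y):
    """All z of the same length with d2(y, z) at most log2(k), where k = len(y)."""
    k = len(y)
    m = floor(log2(k))
    candidates = (tuple(prefix) + tuple(y[m:]) for prefix in product((0, 1), repeat=m))
    return [z for z in candidates if d2(y, z) <= log2(k)]


y = (0,) * 16
print(len(neighborhood(y)))
```

Strings whose suffix differs from that of $y$ are never generated, since their distance $d_2$ is infinite by definition.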

In [11], Rompel showed a way to construct a family of universal one-way hash functions from any one-way function.

In this paper, we strengthen the notion of one-way hash functions and con- 
struct families of strongly one-way hash functions from any single one-way hash 
function or any one-way function. 

Definition 1. A family $H = \bigcup_{k>0} H_k$, where $H_k = \{h \mid h : \{0, 1\}^k \to \{0, 1\}^{\ell(k)}\}$, is a family of one-way neighbor-free hash functions if it satisfies the following:

For any probabilistic polynomial-time algorithm $A$, for any polynomial-size neighborhood $N$, for any polynomial $p$, and for sufficiently large $k$,

$\Pr[A(h, x, 1^k) = x', x \ne x', h(x') \in N_k(h(x))] < 1/p(k)$,

where the probability is taken over all $x$'s of length $k$ and all $h$'s in $H_k$ and the internal coin tosses of $A$, with the uniform probability distribution.

For a finite set $S$, the notation $s \in_U S$ means that the element $s$ is randomly chosen from the set $S$ with the uniform probability distribution.



3 Main Results 

Theorem 1. If there exists a single one-way hash function, there exists a family 
of one-way neighbor-free hash functions. 

Proof. Suppose that is a single one-way hash function. Let H = [Jk>o^k, 
where Hk = {h o h \ h G Gk} and Gk is the set of all permutations on {0, 1}^. 
Construction of Gk has been discussed in [4]. Note that, on the condition that 
a permutation h is randomly chosen from Gk with the uniform probability dis- 
tribution, even if x is chosen according to arbitrary probability distribution, the 
probability distribution of h(x) accords with the uniform probability distribu- 
tion. We will show that H is a family of one-way neighbor-free hash functions. 

Assume, on the contrary, that there exist a probabilistic polynomial-time 
algorithm A, a polynomial p and a polynomial-size neighborhood N whose bound 
polynomial is q such that for infinitely many k, 

Pr[ A(/i o h, X, 1^) = x' , X x', h o h{x') G Nk{h o h{x)) \ 

cc Gc/ {0, 1}'=, hoh^u Hk]>l/p{k). 

Let P be the above probability. Then 

P = Pi[A{h,x,l'^) = x', h{x)y^h{x'), h o h{x') G Nk{h o h{x)) \ 

X Gu {0, 1}^, h Gu Gk] 

= Pr[ A'(y, l'=) = x', y^h{x'), hoh{x')GNk{h{y))\ 

y Gu {0, 1}^, h Gu Gk] 

= Pr[^'(y, y + y', Hy') g Nk{h{y)) \ 

y Gu {0, 1}^, hGu Gk] 

= Pr[ A"(y, 1'=) = y',y^ y' , h{y') G Nk{h{y)) \ y Gu {0, 1}'=]. 

For any y, y' G {0, 1}^, the binary event “h{y') G iVfc(/i(y))” (or, equivalently, 
“Hy) G Nk(Hy')T) can be partitioned into Ei{y, y'),E 2 (y, y '), . . . , Eq(^k){y, y'), 
where Ei{y,y') is the event which occurs if y is different from y' in some fixed 
bit positions according to i. Note that for any i if Ei{y,y') and Ei{y',y”) hold 
then y = y” . That is, 

q{k) 

P = Y,Pr[A"{y,H) = y', y^y', EM) \y 

i=l 

It then follows from the assumption that there exists j such that 

Vr[A"{y,H) = y\ y^y', Ej{y,y') \y Gu {0AH]> {p{k)Hk))~^- 

We set j = j{k) without loss of generality. 

Now, we construct Algorithm B to find a sibling of x using Algorithm A. 
Algorithm B, given x, picks h' at random and gives x and ho h' to Algorithm 
A. Algorithm B then lets x' be the output of A, picks h” at random, and gives 
x' and h o h” to Algorithm A again. Finally Algorithm B lets x” be the second 
output of A and gives x” as its output. 

Then, the probability that B{x) = x" and h{x) = h{x") can be estimated as 
follows. 



Pr[B(x,C) = x", Hx) -- 
A{h, X, H) = x', X 

_A{h,x',H) = x", 



> Pr 



= h{x")\xGu{0,l}H 
yf x' ,Ej(k){Hx),Hx')), 
x' yf x", Ej(^k)iHx'), Hx”)) 



X Gu { 0 , 1 }^, h, h Gu Gk 
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= Pr 



y,z€u {0,1}'^ 



A"{y, l'^) = y',y^ y', Ej(^k){y, y'), 

[ A”{z, l'^) = z', z ^ z', Ej(^k){z, z') 

= (Pr[A"(y, 1'=) = y', y ^ y' , Ej(^k){y,y') \ y &u {OA}'"]) 

> (p(fc)g(fc))"^ 

We next estimate the probability that B{x) = x. 



( 1 ) 



Pr 
= Pr 

= Pr 



X Gu {0, 1}^5 h, h Gu Gk 

x,y,zGu {0,1}'^ 

x,y,zGu {0,1}'^ 



A{h,x,l'") = x', Xy^x', Ej(^k)iHx),Kx')),\ 

[A{h,x',l'") = x, x' ^ X, Ej(^k){Hx'),h{x)) 

A"{y, 1'=) = y', y^ y' , Ej(^k){y, y'), 

_A”{z,l’') = x, z^x, Ej(^k){z,x) 

A"{y, 1'=) = y', y^ y' , Ej(^k){y, 2/')> 

A”{z, l'^) = z', z^ z', Ej(^k){z, z'), 

X = z' 

= Fr[A”{y,i^) = y', y y' , Ej(^k){y,y') \ y &u {OA}^] 

■ ^ (Pr[A"(z, l'=) = z', z z',^;j(fc)(z, z'),a; = z' I z 6(7 {0, = x'] 

x'e{o,i}'= 

•Pr[x = x' \x Gu {0, 1}'^]) 

= Fr[A''{y,A) = y', y A v' , Ej(^k){y,y') \ y (^u {0,1}^] 

^ X! ^^[A''(z,A) = z', zA z', Ej(^k){z,z'), x' = z' \zGu {OA}^] 



x'e{o,i}'= 



1 



= -^ {^"^[^"(yA^) = y', yAy\ Ej,^k){.y,y')\y A>A}’"]Y 



(2) 



Combining (1) and (2), we have 

Pr[_B(cc, l^) = cc", h{x)= h{x"), x A x" \ x Gu {0, 1}^] > (p(fc)(?(fc))“^(l— 1/2^). 

This contradicts that /i is a single one-way hash function. □ 

The above discussion is applicable to the case of a family of universal one-way 
hash functions instead of any single one-way hash function. 



Theorem 2. If there exists a family of universal one-way hash functions, there 
exists a family of one-way neighbor-free hash functions. 

Since Rompel showed a way to construct a family of universal one-way hash 
functions from any one-way function [11], we obtain the following. 

Corollary 1. If there exists a one-way function, there exists a family of one-way 
neighbor-free hash functions. 
4 Conclusion

We showed a way to construct a family of one-way neighbor-free hash functions
using a single one-way hash function and a family of permutations. Since one-way
neighbor-freeness is a stronger notion than one-wayness, some one-way hash
functions within cryptographic systems can be replaced with one-way
neighbor-free hash function families, and the modified systems then achieve
stronger security.
Abstract. We consider the performance of hyperelliptic curve cryptosystems
over the fields $F_p$ vs. $F_{2^n}$. We analyze the complexity of the
group law of the Jacobians $J_C(F_p)$ and $J_C(F_{2^n})$ and compare their
performance, taking into consideration the effect of the word size (32-bit
or 64-bit) of the applied CPU (Alpha and Pentium) on the arithmetic
of the definition field. Our experimental results show that $J_C(F_{2^n})$ is
faster than $J_C(F_p)$ on an Alpha, whereas $J_C(F_p)$ is faster than $J_C(F_{2^n})$
on a Pentium. Moreover, we investigate the algorithm of the Jacobian
and the definition-field arithmetic to clarify our results from a practical
point of view, with theoretical analysis.

Keywords: Hyperelliptic curve cryptosystem, Jacobian, Efficient imple- 
mentation, Lagrange reduction 



1 Introduction 

We implemented the group law of the Jacobians of hyperelliptic curves in software.
In particular, we present here a practical comparison between the performance
of hyperelliptic curve cryptosystems over $F_p$ and $F_{2^n}$.

1.1 ECC and HECC 

Elliptic curve cryptosystems (ECCs) [Ko87,Mi85] are now being used extensively
in industry [Cer,RSA]. There has been much work done in recent years on
their implementation [BP98,CMO98,GP97,So97,WBV96,WMPW98], and the
cryptanalysis of ECCs is still being explored [FMR98,GLV98,MOV93,SA97,
Se98,Sm97,WZ98]. As a natural generalization of ECCs, Koblitz [Ko88,Ko89]
proposed hyperelliptic curve cryptosystems (HECCs) induced from the group law
of Jacobians [Ca87] defined over finite fields as a source of finite abelian groups.
These Jacobian varieties seem to be a rich source of finite abelian groups for
which, so far as is known, the discrete log problem is intractable [Ko89]. While
attacks on HECCs have been explored [ADH94,En99,FR94,Ru97], the design
of HECCs resistant to these attacks and their implementations have also been
considered [Sm99,SS98,SSI98].



1.2 Previous Work 

Win, Mister, Preneel and Wiener [WMPW98] implemented elliptic curve
cryptosystems in software and compared their performance over the fields $F_p$ and
$F_{2^n}$. Sakai and Sakurai [SS98] implemented the group law in the Jacobian of
a hyperelliptic curve. However, they only considered curves over $F_{2^n}$. Smart
[Sm99] reported the performance of the group law in the Jacobian of curves of
arbitrary genus over both $F_{2^n}$ and $F_p$. As for theoretical analysis, Enge [En98]
determined the complexity of the group law of Jacobians and the
efficiency of HECCs. In particular, Enge gives the average bit complexity of the
arithmetic in hyperelliptic Jacobians.

We should note that Enge's examination [En98] assumes that the complexity
of field operations is either constant, or grows with $\log q$ or $\log^2 q$. Indeed, these
assumptions are theoretically reasonable. However, the complexity of field
operations can also depend on the word size (32 or 64 bits) of the applied CPU (Pentium
or Alpha). In fact, Smart [Sm99] remarks that he chose the values of $p$ and $n$ such
that $p$ and $2^n$ would be less than $2^{32}$ so as to ensure that the basic arithmetic
over $F_p$ and $F_{2^n}$ could all be fitted into single words on a computer. In [Sm99],
however, Smart presented experimental timing results only for HECDSA($F_{2^n}$)
and gave no specific data on HECDSA($F_p$), although he did give results for the
case when $g = 1$, i.e. ECDSA($F_p$) and ECDSA($F_{2^n}$).

Thus, there are currently no published reports on the performance of
hyperelliptic curve cryptosystems over $F_p$ vs. $F_{2^n}$ based on experimental data. In
the case of elliptic curves, the number of field operations for doubling a point
in $E(F_p)$ is almost the same as that of $E(F_{2^n})$, but in a hyperelliptic doubling
the number of field operations in $J_C(F_{2^n})$, where $C$ has the form $v^2 + v = f(u)$,
is much smaller than in $J_C(F_p)$. However, if field operations in $F_p$ are
relatively efficient compared to those in $F_{2^n}$, then an HECC using $J_C(F_p)$ might be
faster than one that uses $J_C(F_{2^n})$. In fact, in [WMPW98], a multiplication in
$F_p$ is faster than in $F_{2^n}$ in software implementations. Thus, the question "Which
is faster, $J_C(F_{2^n})$ or $J_C(F_p)$?" is never trivial. This is the motivation for our
comparison, which considers practical aspects and a real implementation.



1.3 Our Contribution 

We implemented hyperelliptic curve cryptosystems over both $F_p$ and $F_{2^n}$, and
compared their performance. In particular, our implementation takes into
account the single word size (32-bit or 64-bit) of our applied CPUs
(an Alpha or a Pentium) in the arithmetic of the definition field, since
modern microprocessors are designed to calculate results in units of data known as
words.
The Arithmetic for Definition Fields. We first consider the efficiency of
the arithmetic for finite field operations¹ before implementing our hyperelliptic
cryptosystems. Our HECCs were implemented on two typical CPUs with different
word sizes, one a Pentium II (300MHz), which has a 32-bit word size, and the
other an Alpha 21164A (600MHz), which has a 64-bit word size.



On the Performance of the Jacobians. The most significant part of computing
the group law of Jacobians is finding a particular divisor's unique reduced
divisor. Several algorithms along with their improvements [Ca87,En98,Ko89,
PS98,Sm99] are known for such a reduction step.

When performing an addition, our implemented algorithm uses Lagrange
reduction, which was developed by Paulus and Stein [PS98], and was generalized
by Enge [En98] to fields of arbitrary characteristic. Enge [En98] never actually
implemented his findings, and Sakai-Sakurai [SS98] did not make use of Lagrange
reduction in their implementation. Only Smart [Sm99] adopted Lagrange
reduction in his algorithm. However, no experimental data over $F_p$ is available from
his study [Sm99].

Thus, the results of our work comprise the first reported experimental data
from implementation of Lagrange reduction over $F_p$ and $F_{2^n}$, taking into account
the word size of the CPU.

Our conclusion is this: In a typical implementation, for a field multiplication
in $J_C(F_p)$, with $\log_2 p = 60$, of a genus 3 curve on the Alpha, we need to divide a
double word integer by a single word integer. This operation is very costly on an
Alpha, because integer division does not exist as a hardware opcode. Therefore,
this instruction is performed via a software subroutine [DEC]. In the case of
$J_C(F_p)$, with $\log_2 p = 29$, of a genus 6 curve, an element of the field can be
represented as a half-size integer on an Alpha. Although a software subroutine
is still needed for division, the field multiplication is inexpensive compared to
the genus 3 curve. In fact, our implementation of field operations shows that a
field multiplication in $F_p$, with $\log_2 p = 29$, is 4 times faster than in $F_p$, with
$\log_2 p = 60$. On the other hand, on the Pentium, a field multiplication in $F_p$,
with $\log_2 p = 15$, is only 2 times faster than in $F_p$, with $\log_2 p = 30$.

2 A Hyperelliptic Curve and Its Jacobian 

This section gives a brief description of Jacobians and their discrete logarithm 
problem. See [Ko98] for more details. 

2.1 Hyperelliptic Curve 

Let $F$ be a finite field and let $\bar{F}$ be its algebraic closure. A hyperelliptic curve
$C$ of genus $g$ over $F$ is an equation of the form $C : v^2 + h(u)v = f(u)$ in $F[u, v]$,
where $h(u) \in F[u]$ is a polynomial of degree at most $g$, $f(u) \in F[u]$ is a monic

¹ In [BP98], it is stated that a multiplication in $F_{2^n}$ takes $cn^2$ steps.
polynomial of degree $2g + 1$, and there are no solutions $(u, v) \in \bar{F} \times \bar{F}$ that
simultaneously satisfy the equation $v^2 + h(u)v = f(u)$ and the partial derivative
equations $2v + h(u) = 0$ and $h'(u)v - f'(u) = 0$. Thus, a hyperelliptic curve does
not have any singular points.
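The nonsingularity condition can be illustrated with a brute-force check. The following is a minimal sketch, assuming a small prime field $F_p$ and searching only the $F_p$-rational affine points (the definition itself quantifies over the algebraic closure, which this sketch does not cover). The function names and the toy prime 11 are illustrative assumptions, not part of the paper.

```python
def poly_eval(coeffs, x, p):
    """Evaluate a polynomial (coefficients listed low degree first) at x mod p."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result


def derivative(coeffs):
    """Formal derivative of a polynomial given low degree first."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def rational_singular_points(h, f, p):
    """List the F_p-rational points of v^2 + h(u) v = f(u) at which both
    partial derivatives 2v + h(u) and h'(u) v - f'(u) vanish."""
    dh, df = derivative(h), derivative(f)
    points = []
    for u in range(p):
        hu, fu = poly_eval(h, u, p), poly_eval(f, u, p)
        dhu, dfu = poly_eval(dh, u, p), poly_eval(df, u, p)
        for v in range(p):
            on_curve = (v * v + hu * v - fu) % p == 0
            dv_zero = (2 * v + hu) % p == 0
            du_zero = (dhu * v - dfu) % p == 0
            if on_curve and dv_zero and du_zero:
                points.append((u, v))
    return points


# v^2 + v = u^5 over F_11 (toy example): no rational singular point.
print(rational_singular_points([1], [0, 0, 0, 0, 0, 1], 11))
# v^2 = u^5 over F_11: the origin is singular, so this is not a valid curve.
print(rational_singular_points([0], [0, 0, 0, 0, 0, 1], 11))
```

An empty list is only a necessary condition here; a full check would also have to rule out singular points defined over extension fields.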

A divisor $D$ on $C$ is a finite formal sum of $\bar{F}$-points, $D = \sum m_i P_i$, $m_i \in \mathbb{Z}$.
We define the degree of $D$ to be $\deg(D) = \sum m_i$. If $K$ is an algebraic extension
of $F$, we say that $D$ is defined over $K$ if for every automorphism $\sigma$ of $\bar{F}$ that
fixes $K$ one has $\sum m_i P_i^{\sigma} = D$, where $P^{\sigma}$ denotes the point obtained by applying
$\sigma$ to the coordinates of $P$ (and $\infty^{\sigma} = \infty$). Let $\mathbf{D}$ denote the additive group of
divisors defined over $K$ (where $K$ is fixed), and let $\mathbf{D}^0$ denote the subgroup
consisting of divisors of degree 0. The principal divisors form a subgroup $\mathbf{P}$ of
$\mathbf{D}^0$. $J(K) = \mathbf{D}^0/\mathbf{P}$ is called the Jacobian of the curve $C$.

2.2 Discrete Logarithm 

The discrete logarithm problem on $J_C(K)$ is the problem, given two divisors
$D_1, D_2 \in J_C(K)$, of determining an integer $m$ such that $D_2 = mD_1$, if such an
$m$ exists.

As with the elliptic curve discrete logarithm problem, no general subexponential
algorithms are known for the hyperelliptic curve discrete logarithm problem,
except for some special cases that are easily avoided [MOV93,SA97,Se98,Sm97].
Only exponential attacks, such as the baby-step giant-step method, Pollard's $\rho$
method or the Pohlig-Hellman method, can be applied. See Appendix A for a
list of conditions necessary for the Jacobian to be secure.

3 Curve Generation 

In this section, we give examples of Jacobians that have an almost prime order
divisible by a large prime of size $\approx 2^{160}$. All of our Jacobians are designed to
be secure against the known attacks discussed in [ADH94,FR94,GLV98,
MOV93,PH78]. The Jacobians given in this section will be implemented and
their efficiencies will be discussed in the later sections.

3.1 Over a Characteristic Two Field $F_{2^n}$

Beth and Schaefer [BS91] used the zeta-function of an elliptic curve to construct 
elliptic cryptosystems. Analogously, Koblitz [Ko88,Ko89,Ko98] used the zeta- 
function of a hyperelliptic curve to construct Jacobians of hyperelliptic curves 
defined over finite fields. One technical difficulty in our computation is that the 
zeta-function has a complicated form with large degree on general hyperelliptic 
curves. Therefore, unlike the previous cases [BS91,Ko88,Ko89,Ko98] it is not 
easy to compute its exact solutions. However, it is known that the order of a 
Jacobian can be computed without evaluating the solution of its zeta-function 
[St93, Chapter V]. Therefore, we apply this algorithm to rectify our problem. 
See Appendix B. 
By this algorithm, we have succeeded in finding Jacobians that have an almost
prime order. For example, Jacobians that have an order divisible by a prime of
size $\approx 2^{160}$ are shown in Table 1.²

Remark. Recently, Gaudry [Ga99] presented a method for speeding up the
discrete log computations on the Jacobian of HECs. The method is asymptotically
faster than the Pollard $\rho$ method if the genus is greater than 4. Moreover,
Duursma, Gaudry and Morain [DGM99] gave a way to gain a speed-up by a
factor $\sqrt{m}$ if there exists an automorphism of order $m$. This method is a
generalization of the parallel collision search [GLV98,WZ98]. Therefore, curves with
genus greater than 4 or automorphisms of large order should be used carefully
in cryptosystems.

From a practical point of view, we have to take into account these two attacks.
Gaudry's attack has a complexity $O(N^{2/g})$, where $N$ denotes the order of the
group (suppose that $N$ is almost prime). The attack is better than the Pollard
$\rho$ method, which has a complexity $O(N^{1/2})$, if $g$ is greater than 4. Therefore,
the attack could be applicable to the genus 5 and 6 curves given in this paper.
Moreover, the Duursma-Gaudry-Morain attack is applicable to the genus 6 curve in
Table 1, which has an automorphism of order $4 \times 29$ [Ga99]. Also for the other
curves given in this paper, the attack might be applicable if an automorphism
exists. However, our results are independent of the specific structure of such
a curve with large automorphisms, and any technique of our implementation is
valid for the curves without automorphisms of large order.
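The genus threshold quoted above follows from comparing the two exponents: $2/g < 1/2$ exactly when $g > 4$. The sketch below is a rough illustration only; it ignores the constants and logarithmic factors hidden in the $O(\cdot)$ notation, and the function names are hypothetical.

```python
from fractions import Fraction

RHO_EXPONENT = Fraction(1, 2)  # Pollard rho: O(N^(1/2))


def gaudry_exponent(g):
    """Exponent of N in the O(N^(2/g)) estimate for genus g."""
    return Fraction(2, g)


def gaudry_beats_rho(g):
    """True when N^(2/g) grows more slowly than N^(1/2)."""
    return gaudry_exponent(g) < RHO_EXPONENT


def log2_work(log2_order, g):
    """Return (log2 of N^(2/g), log2 of N^(1/2)) for a group of the given size."""
    return gaudry_exponent(g) * log2_order, RHO_EXPONENT * log2_order


for g in range(2, 8):
    print(g, gaudry_beats_rho(g))

# Group sizes taken from Table 1 (genus 5 and genus 6 rows).
print(log2_work(155, 5))
print(log2_work(174, 6))
```

Under this crude model the genus 5 and genus 6 groups of Table 1 lose a noticeable part of their nominal security margin, which is the concern the Remark raises.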



Table 1. Jacobians over $F_{2^n}$ of genus 3, 4, 5 and 6 curves of the form $v^2 + v = f(u)$

genus   F_q           f(u)          log_2 #J_C(F_q)   log_2 p, where p | #J_C(F_q)
3       [illegible]   [illegible]   178               165
4       [illegible]   [illegible]   164               161
5       [illegible]   [illegible]   155               151
6       F_{2^29}      [illegible]   174               170



3.2 Over a Prime Field Fp 

Unlike elliptic curves, it is still difficult, in general, to compute the order of
hyperelliptic Jacobians. However, in [BK97,Ko98], Koblitz and Buhler suggest
using curves over some finite prime field $F_p$ of the form $v^2 + v = u^n$, where
$n = 2g + 1$ is an odd prime and $p \equiv 1 \pmod{n}$. Its Jacobian is a quotient of the
Jacobian of the Fermat curve $X^n + Y^n = 1$. The order of such curves can be
determined by computing a Jacobi sum of a certain character. It is possible to
determine a Jacobi sum in polynomial time using the LLL-algorithm [LLL82].

² In [SS98], Jacobians that have an almost prime order of approximately $2^{160}$, over
small characteristic finite fields $F_{2^n}$, $F_{3^n}$, $F_{5^n}$ and $F_{7^n}$, have been tabulated.
See Appendix B. The details of the procedure to determine $\#J_C(F_p)$ of the
curve can be found in [BK97,Ko98].

The following Jacobians over a prime finite field $F_p$ have an order divisible by
a prime of size $\approx 2^{160}$. The Jacobian of the genus 6 curve should be used carefully
in cryptosystems, because the Duursma-Gaudry-Morain attack may be applicable.

genus 2 curve

$v^2 + v = u^5$ over $F_p$, $p = 9671406556917033397660861$ ($\log_2 p = 84$)
$\#J_C(F_p) = 93536104789297918652114038095154103630376284213875$
$= 5^3 \cdot 748288838314383349216912304761232829043010273711$
$\log_2$(largest prime factor) $= 160$

genus 3 curve

$v^2 + v = u^7$ over $F_p$, $p = 631153760340591307$ ($\log_2 p = 60$)
$\#J_C(F_p) = 251423300188980936808533376314868064530443303970434811$
$= 7^3 \cdot 733012536994113518392225586923813599214120419738877$
$\log_2$(largest prime factor) $= 169$

genus 6 curve

$v^2 + v = u^{13}$ over $F_p$, $p = 269049691$ ($\log_2 p = 29$)
$\#J_C(F_p) = 379316622800922233782741202725478330656627788904081$
$= 13^3 \cdot 157 \cdot 1099694785886145362618803297853988300944912689$
$\log_2$(largest prime factor) $= 150$
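The arithmetic relations among the listed parameters can be checked mechanically. The sketch below assumes that the quoted $\log_2$ values denote bit lengths and that $n = 2g + 1$ as in the curve form $v^2 + v = u^n$; it checks only the stated products and congruences, not the primality of any factor. The dictionary keys and the function name are illustrative.

```python
# Values copied from the three examples above.
EXAMPLES = [
    {"genus": 2, "p": 9671406556917033397660861, "p_bits": 84,
     "order": 93536104789297918652114038095154103630376284213875,
     "cofactor": 5 ** 3,
     "large": 748288838314383349216912304761232829043010273711,
     "large_bits": 160},
    {"genus": 3, "p": 631153760340591307, "p_bits": 60,
     "order": 251423300188980936808533376314868064530443303970434811,
     "cofactor": 7 ** 3,
     "large": 733012536994113518392225586923813599214120419738877,
     "large_bits": 169},
    {"genus": 6, "p": 269049691, "p_bits": 29,
     "order": 379316622800922233782741202725478330656627788904081,
     "cofactor": 13 ** 3 * 157,
     "large": 1099694785886145362618803297853988300944912689,
     "large_bits": 150},
]


def check(example):
    """Return a dict of consistency checks for one parameter set."""
    n = 2 * example["genus"] + 1  # the curve is v^2 + v = u^n
    return {
        "p = 1 (mod n)": example["p"] % n == 1,
        "order = cofactor * large factor":
            example["cofactor"] * example["large"] == example["order"],
        "bit length of p": example["p"].bit_length() == example["p_bits"],
        "bit length of large factor":
            example["large"].bit_length() == example["large_bits"],
    }


for example in EXAMPLES:
    print(example["genus"], check(example))
```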



4 Arithmetic for Group Operations in a Jacobian 

This section gives a brief description of the algorithm for adding two points and 
doubling a point on a Jacobian. Addition is accomplished by two procedures. 
First, we compute the composition step, then we compute the reduction step. 
The details of this algorithm are given in, for example, [Ca87,En98,Ko98,PS98]. 



4.1 Composition Step 

Let $D_1 = \mathrm{div}(a_1, b_1)$ and $D_2 = \mathrm{div}(a_2, b_2)$ be two reduced divisors, both
defined over $F_q$. Then the following algorithm³ finds a semireduced divisor $D =
\mathrm{div}(a', b')$, such that $D \sim D_1 + D_2$ [En98].

Composition

1. Perform two extended Euclidean computations to compute
   $d = \gcd(a_1, a_2, b_1 + b_2 + h) = s_1 a_1 + s_2 a_2 + s_3 (b_1 + b_2 + h)$
2. Set $a' = a_1 a_2 / d^2$ and
3. $b' = b_1 + (s_1 a_1 (b_2 - b_1) + s_3 (f - b_1^2 - b_1 h))/d \pmod{a'}$

To compute $d$, $s_1$, $s_2$ and $s_3$, two extended gcd's should be performed. If $a_1$ and
$a_2$ have no common factor, the composition algorithm is even simpler. Note that
the case $\gcd(a_1, a_2) = 1$ is extremely likely if the definition field is large and $a_1$
and $a_2$ are the coordinates of two randomly chosen elements of the Jacobian.

³ This algorithm is more efficient than the algorithm in [Ko98].

When $a_1 = a_2$ and $b_1 = b_2$, i.e., doubling an element, we can take $s_2 = 0$.
Moreover, in the case of char $F_q = 2$ and $h(u) = 1$, we can take $d = 1$,
$s_1 = s_2 = 0$, $s_3 = 1$. Therefore, a doubling in $J_C(F_{2^n})$ of the curves of the form
$v^2 + v = f(u)$ can be simplified as follows.

Composition: Doubling in $J_C(F_{2^n})$ of the curves $v^2 + v = f(u)$

1. Set $a' = a_1^2$ and
2. $b' = b_1^2 + f \pmod{a'}$

Thus, in $J_C(F_{2^n})$ with $C : v^2 + v = f(u)$, doubling is much simpler than in
$J_C(F_p)$.

4.2 Reduction Step 

To complete the addition, we must find a unique reduced divisor $D = \mathrm{div}(a, b)$.
There are three known algorithms for such a reduction step: Gauss reduction,
Cantor reduction and Lagrange reduction [En98]. The reduction algorithm shown
in this subsection was given by Paulus and Stein for hyperelliptic curves over a
field of odd characteristic [PS98]. A generalized version for arbitrary
characteristic was given by Enge in [En98]. It is called Lagrange reduction (Paulus and
Stein's algorithm can be traced back to Lagrange).

Let $a_0 = a'$ and $b_0 = b'$, and compute a sequence $(a_k, b_k)$ for $k = 1, \ldots, t$ by
the following algorithm, where $t \ge 0$ is the smallest index such that $\deg a_t \le g$.



Lagrange reduction

1. $a_1 = (f - b_0^2 - b_0 h)/a_0$
2. $-b_0 - h = q_1 a_1 + b_1$ with $\deg b_1 < \deg a_1$
3. For $k \ge 2$:
4. $a_k = a_{k-2} + q_{k-1}(b_{k-1} - b_{k-2})$
5. $-b_{k-1} - h = q_k a_k + b_k$ with $\deg b_k < \deg a_k$



Note that in the worst case, $a'$ may have degree $2g$, i.e. $\deg a_1 = \deg a_2 = g$. In
fact, this is the typical case. Therefore, the iteration step of the above algorithm
may execute quite a few times.
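The recurrence of the Lagrange reduction can be exercised on a toy example. The following is a minimal sketch, not the implementation measured in this paper: it assumes the small prime field $F_{29}$, genus 3 and the curve $v^2 + v = u^7$, represents polynomials as coefficient lists (low degree first), and builds a semireduced divisor of degree $2g$ from six rational points with distinct $u$-coordinates. All helper names are hypothetical, and the final scaling of $a$ to a monic polynomial is an added convenience.

```python
P, G = 29, 3          # toy prime field and genus (assumed example values)
F = [0] * 7 + [1]     # f(u) = u^7
H = [1]               # h(u) = 1, so the curve is v^2 + v = u^7


def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a


def add(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])


def neg(a):
    return trim([-c for c in a])


def mul(a, b):
    out = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def divmod_poly(a, b):
    """Quotient and remainder of a by a nonzero polynomial b over F_P."""
    a, b = trim(a), trim(b)
    q = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], P - 2, P)
    while len(a) >= len(b):
        shift, c = len(a) - len(b), a[-1] * inv % P
        q[shift] = c
        a = add(a, neg(mul([0] * shift + [c], b)))
    return trim(q), a


def residual(b):
    """f - b h - b^2, which a must divide when div(a, b) is semireduced."""
    return add(F, neg(add(mul(b, H), mul(b, b))))


def interpolate(pts):
    """Lagrange interpolation: the polynomial b with b(x_i) = y_i."""
    b = []
    for i, (xi, yi) in enumerate(pts):
        term, denom = [1], 1
        for j, (xj, _) in enumerate(pts):
            if j != i:
                term, denom = mul(term, [-xj, 1]), denom * (xi - xj) % P
        b = add(b, mul(term, [yi * pow(denom, P - 2, P)]))
    return b


def lagrange_reduce(a0, b0):
    a_prev, b_prev = trim(a0), trim(b0)
    if len(a_prev) - 1 <= G:                       # t = 0, nothing to do
        return a_prev, b_prev
    a_cur, rem = divmod_poly(residual(b_prev), a_prev)      # step 1
    assert rem == [], "input is not semireduced"
    q, b_cur = divmod_poly(neg(add(b_prev, H)), a_cur)      # step 2
    while len(a_cur) - 1 > G:                               # steps 3 to 5
        a_next = add(a_prev, mul(q, add(b_cur, neg(b_prev))))
        q, b_next = divmod_poly(neg(add(b_cur, H)), a_next)
        a_prev, b_prev, a_cur, b_cur = a_cur, b_cur, a_next, b_next
    return mul(a_cur, [pow(a_cur[-1], P - 2, P)]), b_cur    # make a monic


def demo():
    """Reduce a degree-2g semireduced divisor built from rational points."""
    pts = []
    for u in range(P):
        for v in range(P):
            if (v * v + v - pow(u, 7, P)) % P == 0:
                pts.append((u, v))
                break                              # one point per u value
    pts = pts[:2 * G]
    a0 = [1]
    for x, _ in pts:
        a0 = mul(a0, [-x, 1])
    return lagrange_reduce(a0, interpolate(pts))


a, b = demo()
print("reduced a:", a, "b:", b)
```

The point of the recurrence in step 4 is that, after the first exact division, each new $a_k$ is obtained from one multiplication of low-degree polynomials instead of a fresh division of $f - b_{k-1}h - b_{k-1}^2$ by $a_{k-1}$.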

In most cases, this reduction algorithm is more efficient than Gauss reduction
and Cantor reduction. The most costly steps in the Gauss reduction procedure
are the computations of the $a_k$'s, each involving one multiplication and one
division of rather high degree polynomials. In the original reduction algorithm,
each step is independent of the previous one. On the other hand, in the Lagrange
reduction, as soon as one reduction step has been carried out, the formula for $a_k$
can be rewritten, using information from the previous step. In the later sections,
we will implement the algorithm for an addition and a doubling in a Jacobian,
using the algorithm described in this section.

In [SS98], hyperelliptic cryptosystems over fields of characteristic 2 were
implemented and their efficiency was discussed. However, this implementation made no
use of Lagrange reduction.
5 Arithmetic for a Finite Field and Its Efficiency

Before implementing our hyperelliptic cryptosystems, we considered the effi- 
ciency of finite field operations. Our implementation was on two different CPUs, 
a Pentium II (300MHz), which has 32-bit word size, and an Alpha 21164A 
(600MHz), which has 64-bit word size. In the later sections, we will analyze 
the performance of hyperelliptic curve cryptosystems based on these efficiencies. 



5.1 Representation of Field Elements 

Representation in $F_p$. In our implementation of hyperelliptic cryptosystems,
for $F_p$, we represented the elements as numbers in the range $[0, p-1]$, where
each residue class is represented by its member in that range. This is clearly the
most canonical way. Another possibility would be to use Montgomery residues.
In our cryptosystems, the size of a field element was relatively small, and so 
could be represented as a single word integer on our computers. An analogous 
representation using Montgomery residues would not have been efficient, since 
we would have needed to compute an extra transformation. Therefore, we chose 
the more natural representation. 



Representation of $F_{2^n}$. For $F_{2^n}$, several methods of representation are known.
Two such methods are the standard basis representation and the optimal normal
basis representation. A third representation lists elements of the field as
polynomials over a subfield of the form $F_{2^r}$, where $r$ is a divisor of $n$. However,
in order to make $\#J_C(F_{2^n})$ divisible by a large prime, we chose $n$ itself to be
prime. Thus, this third method was not applicable. The optimal normal basis
representation enables efficient implementation in hardware. However, in our
experience, this method is inefficient in software, compared to the standard basis
representation. Therefore, we implemented our cryptosystems using the standard
basis representation. These implementations can be made more efficient if an
irreducible polynomial with low Hamming weight and few terms of high degree
is chosen, such as a trinomial or a pentanomial.
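A standard basis multiplication can be sketched with shifts and XORs on a machine integer. The code below is an illustrative shift-and-add routine, not the implementation timed in this paper; elements are integers whose bit $i$ holds the coefficient of $x^i$, and the modulus is passed in by the caller. The small field $F_{2^3}$ with the irreducible trinomial $x^3 + x + 1$ is used only as a toy example.

```python
def gf2n_mul(a, b, n, modulus):
    """Multiply a and b in F_{2^n} = F_2[x]/(modulus), standard basis.

    `modulus` is the irreducible polynomial of degree n encoded as an
    integer (bit i is the coefficient of x^i); a and b must be below 2^n.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if (a >> n) & 1:
            a ^= modulus         # reduce as soon as degree n appears
    return result


# Toy field F_8 with modulus x^3 + x + 1 (0b1011).
print(gf2n_mul(0b010, 0b100, 3, 0b1011))   # x * x^2 = x^3 = x + 1
print(gf2n_mul(0b011, 0b111, 3, 0b1011))   # (x + 1)(x^2 + x + 1) = x
```

With a trinomial or pentanomial modulus the reduction step touches only a few bits, which is the reason low-weight moduli are preferred in software.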

5.2 Field Multiplication and Inversion 

In this subsection, we analyze the performance of field operations. In our set- 
tings, as we have already pointed out, an element of the definition field can be 
represented as a single word integer on the computers that we used. In such a 
case, we do not require a multi-precision integer library. This fact contributes to 
faster implementations. 

We conducted an experiment to examine the performance of field multiplications
and field inversions in $F_q$, where $q$ went up to 160 bits. We focused on the
case where a field element could be represented as a single word integer on the
CPUs, because we are interested in the hyperelliptic settings that can be
implemented using only single word integers on CPUs. Many fast implementations of
elliptic curve cryptosystems have been developed and their efficiencies analyzed
for fields of size approximately $2^{160}$. However, for fields of small size, where
elements can be represented as a single word integer on a CPU, the performance
of the field operations has not been reported in the open literature.

Tables 2 and 3 show the performance of our implementation on the Alpha 
and the Pentium, respectively. From our implementation, we can observe that: 

Over $F_p$

Result 1 A field multiplication has a time complexity of $c\lceil m/w \rceil$, where
$m = \log_2 p$, $w$ is the processor's word size and $c$ is some constant.
Moreover, if $m/w \le 1/2$, a multiplication can be much faster than in the
case $1/2 < m/w \le 1$. On the Alpha, a field multiplication takes 0.16 µsec
when $\log_2 p < 32$, 0.61 µsec when $32 < \log_2 p < 64$. On the Pentium, a
field multiplication takes 0.15 µsec when $\log_2 p < 16$, 0.28 µsec when
$16 < \log_2 p < 32$.

Result 2 The time for a field inversion grows linearly in $\log_2 p$. On the
Alpha, a field inversion takes $0.1 \log_2 p$ µsec. On the Pentium, a field
inversion takes $0.12 \log_2 p$ µsec.

Over $F_{2^n}$

Result 3 The time for a field multiplication grows linearly in $n = \log_2 2^n$.
On the Alpha, a field multiplication takes $0.015n$ µsec. On the Pentium,
a field multiplication takes $0.04n$ µsec.

Result 4 The time for a field inversion grows linearly in $n$. On the Alpha,
a field inversion takes $0.2n$ µsec. On the Pentium, a field inversion
takes $0.6n$ µsec.
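Results 1 to 4 amount to a small empirical cost model, which can be written down directly. The sketch below simply encodes the constants quoted above; the table layout, the function names and the treatment of the exact boundary sizes (half a word, one word) are assumptions made for illustration.

```python
# Constants quoted in Results 1-4, in microseconds.
MODEL = {
    "alpha":   {"word": 64, "fp_mul_half": 0.16, "fp_mul_full": 0.61,
                "fp_inv_per_bit": 0.10, "f2n_mul_per_bit": 0.015,
                "f2n_inv_per_bit": 0.2},
    "pentium": {"word": 32, "fp_mul_half": 0.15, "fp_mul_full": 0.28,
                "fp_inv_per_bit": 0.12, "f2n_mul_per_bit": 0.04,
                "f2n_inv_per_bit": 0.6},
}


def fp_mul_usec(bits, cpu):
    """Modelled time of one multiplication in F_p with a `bits`-bit prime."""
    m = MODEL[cpu]
    if bits <= m["word"] // 2:
        return m["fp_mul_half"]
    if bits <= m["word"]:
        return m["fp_mul_full"]
    raise ValueError("model covers single-word primes only")


def fp_inv_usec(bits, cpu):
    return MODEL[cpu]["fp_inv_per_bit"] * bits


def f2n_mul_usec(n, cpu):
    return MODEL[cpu]["f2n_mul_per_bit"] * n


def f2n_inv_usec(n, cpu):
    return MODEL[cpu]["f2n_inv_per_bit"] * n


# Ratio between a 60-bit and a 29-bit prime on the Alpha, and between a
# 30-bit and a 15-bit prime on the Pentium.
print(fp_mul_usec(60, "alpha") / fp_mul_usec(29, "alpha"))
print(fp_mul_usec(30, "pentium") / fp_mul_usec(15, "pentium"))
```

The two printed ratios correspond to the "4 times" and "2 times" figures stated in the introduction.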

We should note that in $F_p$, field multiplications for fields whose size is smaller
than half the CPU's word length are faster than those for fields whose size is
larger than half the CPU's word length. The reason is that a field multiplication
is computed as $(a * b) \bmod p$, and the intermediate product $(a * b)$ can be
up to twice as long, in bits, as the larger of $a$ and $b$. Therefore, if
$a$ and $b$ are longer than half the CPU's word size, then $(a * b)$ may be longer than
the word size. On a Pentium, we can perform this computation with the double
word instruction, ((long long)a * (long long)b)%p in C language. As an Alpha
does not have such a double word instruction, we must use some special
technique or assembly language. On both the Pentium and the Alpha, such double
word computation is costly compared to single word computation.
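Whether a given prime needs double word arithmetic can be decided from the size of the largest possible intermediate product. The sketch below is illustrative only (Python integers are unbounded, so it models the word size rather than depending on it); the function name is hypothetical, and the two primes are the genus 6 and genus 3 examples of Section 3.2.

```python
def product_fits_in_word(p, word_bits):
    """True if (p - 1)^2, the largest product of two reduced residues
    modulo p, fits in an unsigned word of `word_bits` bits."""
    return (p - 1) ** 2 < 2 ** word_bits


P_GENUS6 = 269049691            # 29-bit prime from Section 3.2
P_GENUS3 = 631153760340591307   # 60-bit prime from Section 3.2

print(product_fits_in_word(P_GENUS6, 64))   # single word suffices on the Alpha
print(product_fits_in_word(P_GENUS3, 64))   # needs double word arithmetic
print(product_fits_in_word(P_GENUS6, 32))   # needs double word on the Pentium
```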



6 Enge’s Analysis on the Average Number of Field 
Operations 

In this section, we summarize the computational cost of the algorithm for the
group law in Jacobians, which has been analyzed by Enge [En98]. To compare
the efficiency of $J_C(F_p)$ and $J_C(F_{2^n})$, it is important to analyze the
average number of field multiplications and inversions. Enge analyzed the average
complexity of the extended Euclidean algorithm on polynomials with heuristic
assumptions.⁵

Table 2. Timings of field operations on an Alpha 21164A (600MHz) in µsec

Several cases need to be distinguished. In this paper, we will concentrate on
the following two: $J_C(F_p)$ and $J_C(F_{2^n})$ with $h = 1$. Table 4 shows the average
number of field multiplications and inversions for composing two distinct reduced
divisors. The proof of correctness can be found in [En98].

We assume that the composition algorithm has yielded a random semireduced
divisor $\mathrm{div}(a', b')$ of degree $2g$, where $a'$ and $b'$ are almost uniformly distributed
over all polynomials of degree $2g$ and $2g - 1$, respectively. Table 5 shows the
average number of field multiplications and inversions for the reduction step
using Lagrange reduction. The proof of correctness can be found in [En98]. We
describe the case that the curves have genus $g \ge 3$, because we will focus on the
case that the operations in the Jacobian can be implemented with a single word
size integer on the CPUs.

According to the above tables, the number of field operations required for
a full addition and doubling step for a genus $g \ge 3$ curve can be estimated as
given in Table 6.



⁵ Heuristic: [En98] Let the hyperelliptic curve $C$ of fixed genus $g$ be chosen randomly
according to a uniform distribution on the defining pairs of suitable polynomials
$(h, f)$. If $\mathrm{div}(a, b)$ is a uniformly selected element of $J_C(F_q)$, then we can assume
that $a$ varies uniformly over all polynomials of degree $g$ and $b$ over all polynomials of
degree $g - 1$. Likewise, if $\mathrm{div}(a_1, b_1)$ and $\mathrm{div}(a_2, b_2)$ are two uniformly selected points
of $J_C(F_q)$, then we can assume that $a_1$ and $a_2$ vary uniformly and independently
over all polynomials of degree $g$ and $b_1$ and $b_2$ over all polynomials of degree $g - 1$.
Table 3. Timings of field operations on a Pentium II (300MHz) in µsec

(Columns in the source: word size, $\log_2 q$, multiplication in $F_p$ and $F_{2^n}$, inversion
in $F_p$ and $F_{2^n}$. Only the $F_{2^n}$ multiplication column is legible.)

word size   log_2 q   multiplication, F_{2^n}
1/2         5         0.20
1/2         10        0.41
1/2         15        0.62
1           20        0.82
1           25        1.03
1           30        1.23
2           40        60
2           60        88
3           80        128
4           100       186
4           120       206
5           140       253
5           160       300


Table 4. Average number of field operations in composition step for $J_C(F_p)$
and $J_C(F_{2^n})$ with $h = 1$

               Addition                   Doubling
               mul.             inv.      mul.           inv.
J_C(F_p)       8g^2 + 5g - 2    g + 2     [illegible]    g + 1
J_C(F_{2^n})   7g^2 + 7g        g + 1     4g + 2         1



7 Our Implementation and Comparisons of the Group 
Operations in the Jacobians 

In this section, we will show our implementation of the group operations: an 
addition, a doubling and a scalar multiplication in the Jacobians. Moreover, we 
will compare their respective efficiencies. 

7.1 The Average Number of Field Operations in a Scalar 
Multiplication 

We will now describe the expected speed of adding two elements and doubling
one element in a Jacobian. Using the average number of field operations in
terms of the genus $g$ shown in Table 6, we can evaluate the speed of operations
in Jacobians of fixed genus curves.

Table 5. Average number of field operations for Lagrange reduction step for
genus larger than or equal to 3

                       multiplications   inversions
J_C(F_p), g even       [illegible]       [illegible]
J_C(F_p), g odd        9g^2 - g          [illegible]
J_C(F_{2^n}), g even   [illegible]       [illegible]
J_C(F_{2^n}), g odd    7g^2 - 2g         [illegible]

Table 6. Average number of field operations for a full addition and doubling

                       Addition                         Doubling
                       multiplications   inversions     multiplications   inversions
J_C(F_p), g even       17g^2 + 3g - 3    [illegible]    [illegible]       [illegible]
J_C(F_p), g odd        17g^2 + 4g - 2    [illegible]    [illegible]       [illegible]
J_C(F_{2^n}), g even   14g^2 + 4g - 1    [illegible]    7g^2 + g + 1      [illegible]
J_C(F_{2^n}), g odd    [illegible]       [illegible]    7g^2 + 2g + 2     [illegible]

Enge, in [En 98], analyzed the average bit complexity of an addition and a
doubling. He examined three basic situations, in which the complexity of field
operations is either constant or grows with log q or log^2 q. In these settings, a
comparison of the efficiency of Jc(Fp) versus Jc(F_{2^n}) was made. He concluded
that Jc(F_{2^n}) is faster than Jc(Fp) for the simple reason that a doubling in
Jc(Fp) is costly compared to a doubling in Jc(F_{2^n}).

However, in real computations, the speed of an addition and a doubling in
a Jacobian depends on the real speed of field operations. Thus, although the
number of field operations in an addition and a doubling for Jc(Fp) is larger
than in Jc(F_{2^n}), if a field multiplication and a field inversion in Fp are faster
than in F_{2^n}, then Jc(Fp) may be faster than Jc(F_{2^n}).

To compare the efficiencies, we introduce here the symbols M and I, denoting
the real speed of a field multiplication and the real speed of a field inversion,
respectively. Table 7 shows the expected speed of group operations in a Jacobian.
The subscripts p and 2^n denote the symbols over Fp and F_{2^n}, respectively. Note
that a Jacobian has order approximately q^g. Therefore, for a fixed group order,
a curve of large genus is defined over a smaller field and has small M and I.
For a scalar multiplication, a randomly chosen element is multiplied by a
160-bit integer using a simple "binary method" ®.
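
The scalar-multiplication column of Table 7 can be reproduced from the addition and doubling columns. The sketch below assumes that the binary method on a 160-bit scalar is charged 160 doublings and, on average, 80 additions; this convention is an assumption, chosen because it matches the Table 7 totals. The function name `binary_method_cost` is hypothetical.

```python
def binary_method_cost(add_cost, dbl_cost, bits=160):
    """Average field-operation count of the binary method.

    add_cost and dbl_cost are (multiplications, inversions) pairs for one
    group addition and one group doubling. Assumption: `bits` doublings and
    bits // 2 additions on average.
    """
    n_dbl = bits
    n_add = bits // 2
    mult = n_dbl * dbl_cost[0] + n_add * add_cost[0]
    inv = n_dbl * dbl_cost[1] + n_add * add_cost[1]
    return mult, inv


# Genus 3 entries of Table 7: (multiplications, inversions)
fp_total = binary_method_cost(add_cost=(163, 8), dbl_cost=(168, 7))
f2n_total = binary_method_cost(add_cost=(137, 7), dbl_cost=(44, 5))
print("Jc(Fp):      %d M + %d I" % fp_total)
print("Jc(F_{2^n}): %d M + %d I" % f2n_total)
```

The same call with the genus 4, 5 and 6 entries gives the remaining rows of the last column.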



7.2 The Average Number of Field Operations in Our 
Implementation 

Table 8 shows the average number of field multiplications and inversions for an
addition and a doubling in a Jacobian based on our implementation. The number
of field multiplications grows on the order of O(g^2). In a doubling, the number of
multiplications for Jc(F_{2^n}) is much smaller than for Jc(Fp). These facts support
Enge’s analysis. The reason that the numbers in Table 8 differ slightly from
Enge’s analysis is that in our implementation, as soon as a reduced divisor has
® There are a number of other optimization techniques available that were not used 
in our implementation. A signed binary window method would be more useful for 
making an elliptic curve and hyperelliptic curve exponentiation fast. However, we 
need an additional computer memory for a window method. Therefore, the simple 
binary method is still used for a practical application such as a smart card, which 
has restricted hardware resources. 
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Table 7. Expected speed, where M and I denote the speed of a field multipli- 
cation and a field inversion, respectively 





g   Jc(Fq)         Addition                   Doubling                   160-bit scalar multiplication
3   Jc(Fp)         163 M_p + 8 I_p            168 M_p + 7 I_p            39920 M_p + 1760 I_p
3   Jc(F_{2^n})    137 M_{2^n} + 7 I_{2^n}    44 M_{2^n} + 5 I_{2^n}     18000 M_{2^n} + 1360 I_{2^n}
4   Jc(Fp)         281 M_p + 9 I_p            285 M_p + 8 I_p            68080 M_p + 2000 I_p
4   Jc(F_{2^n})    239 M_{2^n} + 8 I_{2^n}    117 M_{2^n} + 4 I_{2^n}    37840 M_{2^n} + 1280 I_{2^n}
5   Jc(Fp)         443 M_p + 9.5 I_p          445 M_p + 8.5 I_p          106640 M_p + 2120 I_p
5   Jc(F_{2^n})    375 M_{2^n} + 10 I_{2^n}   187 M_{2^n} + 5 I_{2^n}    59920 M_{2^n} + 1600 I_{2^n}
6   Jc(Fp)         627 M_p + 12 I_p           628 M_p + 11 I_p           150640 M_p + 2720 I_p
6   Jc(F_{2^n})    527 M_{2^n} + 11 I_{2^n}   259 M_{2^n} + 5 I_{2^n}    83600 M_{2^n} + 1680 I_{2^n}



been obtained, we divide all coefficients of a (a polynomial of div(a, b)) by its
leading coefficient, so that a becomes monic.



Table 8. Average number of field operations in an addition and a doubling with
Lagrange reduction

g   Fq                    f(u)           Addition                Doubling
                                         mul.      inv.          mul.      inv.
3   Fp (log2 p = 60)      u^7            402       [illegible]   379       [illegible]
3   F_{2^59}              u^7            370       [illegible]   202       [illegible]
4   F_{2^41}              [illegible]    631       [illegible]   341       [illegible]
5   F_{2^31}              [illegible]    [illegible] [illegible] 587       [illegible]
6   Fp (log2 p = 29)      [illegible]    1591      [illegible]   1513      [illegible]
6   F_{2^29}              [illegible]    1475      [illegible]   835       [illegible]


7.3 Expected Speed 

In standard discrete logarithm-based cryptographic protocols, such as DSA and 
ElGamal signature and verification variants, multiplications by large integers are 
performed. Therefore, the total performance for cryptographic protocols can be 
evaluated by the speed of scalar multiplications in a Jacobian. 

Let #mult and #inv denote the number of field multiplications and inversions
in a scalar multiplication, respectively. The efficiency of a scalar multiplication
S can be formulated as follows.

S = #mult · M + #inv · I

It should be remarked that #mult and #inv depend not only on the size of
the field but also on the field’s characteristic, as we have seen in the previous
section. Table 9 shows the efficiency of the Jacobians. In the table, #mult and
#inv are taken from Enge’s analysis of the average number. From this table, we
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can observe a few facts. Note that some of these facts are contrary to Enge’s 
analysis. The details will be discussed in a later subsection. 
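
The formula S = #mult · M + #inv · I can be checked against the entries of Table 9. The following sketch recomputes S from the tabulated operation counts and field timings; the function name `scalar_mult_time` is hypothetical, and the inputs are the rounded values printed in Table 9, so small deviations from the tabulated S are expected.

```python
def scalar_mult_time(n_mult, n_inv, m_usec, i_usec):
    """Expected time S (in microseconds) of one scalar multiplication."""
    return n_mult * m_usec + n_inv * i_usec


# (platform, field, #mult, #inv, M, I, tabulated S), all taken from Table 9
rows = [
    ("Alpha", "Fp, log2 p = 60", 4.0e4, 1.8e3, 0.61, 7.83, 3.84e4),
    ("Alpha", "F_{2^59}", 1.8e4, 1.4e3, 0.72, 14.3, 3.29e4),
    ("Alpha", "Fp, log2 p = 29", 15.1e4, 2.7e3, 0.16, 2.11, 2.98e4),
    ("Alpha", "F_{2^29}", 8.3e4, 1.7e3, 0.44, 5.50, 4.59e4),
    ("Pentium", "Fp, log2 p = 29", 15.1e4, 2.7e3, 0.28, 3.18, 5.09e4),
    ("Pentium", "F_{2^29}", 8.3e4, 1.7e3, 1.23, 20.1, 13.6e4),
]

for platform, field, n_mult, n_inv, m, i, tabulated in rows:
    s = scalar_mult_time(n_mult, n_inv, m, i)
    print(f"{platform:8s} {field:18s} S = {s:9.0f} usec (table: {tabulated:.0f})")
```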



Table 9. Expected Performance

Platform   g   Fq                 word size   M (μsec)   I (μsec)   #mult          #inv          S (μsec)
Alpha      3   Fp, log2 p = 60    1           0.61       7.83       4.0 · 10^4     1.8 · 10^3    3.84 · 10^4
Alpha      3   F_{2^59}           1           0.72       14.3       1.8 · 10^4     1.4 · 10^3    3.29 · 10^4
Alpha      6   Fp, log2 p = 29    1/2         0.16       2.11       15.1 · 10^4    2.7 · 10^3    2.98 · 10^4
Alpha      6   F_{2^29}           1/2         0.44       5.50       8.3 · 10^4     1.7 · 10^3    4.59 · 10^4
Pentium    6   Fp, log2 p = 29    1           0.28       3.18       15.1 · 10^4    2.7 · 10^3    5.09 · 10^4
Pentium    6   F_{2^29}           1/2         1.23       20.1       8.3 · 10^4     1.7 · 10^3    13.6 · 10^4



7.4 Timings in Jc(Fp) and Jc(F_{2^n})

Next, we show the timings of an addition, a doubling and a scalar multiplication
in Jc(Fp) and Jc(F_{2^n}) in our implementation. For a scalar multiplication, a
randomly chosen element in the Jacobian was multiplied by a large integer.
In standard discrete logarithm-based cryptographic protocols, such as DSA and
ElGamal signature and verification variants, multiplications by large integers are
performed. In such a case, the integer has the size of the order of the subgroup
of a finite abelian group (in our case, Jc(Fq)). The abelian groups given in
this paper have subgroups of order approximately 2^160, though these orders
are different from each other. In our implementation, we multiplied a randomly
chosen element from each of the Jacobians by a 160-bit integer, to compare
performance.

The timings for hyperelliptic Jacobians, given in Tables 10 and 11, were
obtained from implementations on a Pentium II (300MHz) and on an Alpha
21164A (600MHz). We used a simple "binary method" for scalar multiplication.
Programs were written mainly in C and compiled by gcc 2.8.1 or DEC C
V5.8 with maximal optimization. Only in the implementation of multiplications
in Jc(Fp), with log2 p = 60, of the genus 3 curve on the Alpha, did we use
assembly language, since we could not compute a 128-bit integer directly with
C instructions.
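
For illustration, the simple binary method can be sketched over an abstract group as follows. This is not the C implementation that was timed; the callables `add` and `dbl` and the identity `zero` are placeholders for the Jacobian operations, and the demonstration uses the additive group of integers modulo n as a stand-in (an assumption made only to keep the sketch self-contained).

```python
def binary_method(k, P, add, dbl, zero):
    """Left-to-right binary (double-and-add) method: returns k * P."""
    T = zero
    for bit in bin(k)[2:]:      # most significant bit first
        T = dbl(T)              # one doubling per bit
        if bit == "1":
            T = add(T, P)       # one addition per non-zero bit
    return T


# Stand-in group: integers modulo n under addition.
n = 1009
add = lambda x, y: (x + y) % n
dbl = lambda x: (2 * x) % n

k = 0xDEADBEEF
P = 5
print(binary_method(k, P, add, dbl, 0), (k * P) % n)
```

For a random 160-bit k, about half of the bits are non-zero, which is the origin of the usual estimate of one addition for every two doublings.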



Table 10. Timings of Jc(Fp) (Alpha 21164A (600MHz), Pentium II (300MHz))

g   Fq                  f(u)          Add. (msec.)        Dbl. (msec.)          Scalar (msec.)
                                      Alpha    Pentium    Alpha    Pentium      Alpha         Pentium
3   Fp (log2 p = 60)    [illegible]   0.39     -          0.38     [illegible]  [illegible]   -
6   Fp (log2 p = 29)    [illegible]   0.28     0.83       0.26     0.80         66            189
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Table 11. Timings of Jc(F 2 n) (Alpha 21164A(600MHz), Pentium II(300MHz)) 



g   Fq          f(u)          Add. (msec.)        Dbl. (msec.)        Scalar (msec.)
                              Alpha    Pentium    Alpha    Pentium    Alpha    Pentium
3   F_{2^59}    u^7           0.30     -          0.09     -          40       -
4   F_{2^41}    [illegible]   0.30     -          0.10     -          43       -
5   F_{2^31}    [illegible]   0.34     1.40       0.10     0.48       46       182
6   F_{2^29}    [illegible]   0.47     1.76       0.13     0.56       61       227



7.5 Efficiency of a Jacobian and Comparison 

From Tables 10 and 11, we can observe the following facts:

Comparison on Genus

Fact 1: For the case Jc(Fp) on the Alpha, we can see that even though the
genus is larger, the scalar multiplication for the genus 6 curve Jc(Fp),
with log2 p = 29, is faster than that of the genus 3 curve Jc(Fp), with
log2 p = 60.

Fact 2: For the case Jc(F_{2^n}), the timing of the scalar multiplication grows
as the genus grows, both on the Alpha and on the Pentium.



Comparison on the Characteristic of the Field

Fact 3: On the Alpha, for genus 3 curves, Jc(F_{2^59}) is much faster than
Jc(Fp), with log2 p = 60.

Fact 4: On the Alpha, for genus 6 curves, Jc(F_{2^29}) is slightly faster than
Jc(Fp), with log2 p = 29.

Fact 5: On the Pentium, for genus 6 curves, Jc(Fp), with log2 p = 29, is
slightly faster than Jc(F_{2^29}).

It should be noted that Facts 1 and 5 differ from Enge’s conclusion [En 98],
which was based on his theoretical analysis. These efficiencies can be evaluated
based on the following points.

1. The number of field multiplications and inversions in a scalar
multiplication:

The number depends on the genus of the curve and on the characteristic of
the definition field Fq.

2. The efficiency of field operations:

The efficiency depends on the size of the fields log2 q, the word size of the
fields on the CPU and on the properties of the CPU’s architecture.



The properties of a CPU’s architecture can affect the performance of field
operations. In the Alpha 21164A processor, integer division does not exist as
a hardware opcode. For the field multiplication of the genus 3 curve Jc(Fp),
with log2 p = 60, on an Alpha, we need to divide a double word integer by a
single word integer in our implementation. This operation is costly since this
instruction is done via a software subroutine [DEC]. For the genus 6 curve Jc(Fp),
with log2 p = 29, an element of the field can be represented as a half-size
integer on an Alpha. Although it still needs a software subroutine for division, a
field multiplication is inexpensive compared to the genus 3 curve. In fact, our
implementation of field operations, as shown in Table 2, shows that a field
multiplication in Fp, where log2 p = 29, is 4 times faster than in Fp, where log2 p = 60.
On the other hand, on a Pentium, a field multiplication in Fp, where log2 p = 15,
is only 2 times faster than in Fp, where log2 p = 30.
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A The Security of DLP in a Jacobian 



We have to choose Jacobians that satisfy the following four conditions to resist all
known attacks.

C1 : General Algorithms

Condition C1 is needed to resist the Pohlig-Hellman method [PH78]. This algorithm
has a running time that is proportional to the square root of the largest prime factor
of #Jc(Fq). Therefore, we need to choose curves such that #Jc(Fq) has a large prime
factor.

C2 : Imbedding into a Small Finite Field

Condition C2 is needed to resist Frey and Rück’s generalization [FR94] of the
MOV-attack [MOV93] using the Tate pairing. This method reduces the logarithm
problem over Jc(Fq) to an equivalent logarithm problem in the multiplicative group
F*_{q^k} of an extension field F_{q^k}. Ways to avoid the MOV-attack have been discussed
in [BS91,CTT94]. We take a similar approach by choosing curves such that the
induced Jacobian Jc(Fq) cannot be imbedded via the Tate pairing into F*_{q^k} with small
extension degree k.

C3 : Large Genus Hyperelliptic Curves

Condition C3 is needed to resist the Adleman-DeMarrais-Huang method [ADH94].
They found a sub-exponential algorithm for discrete logarithms over the rational
subgroup of the Jacobians of large genus hyperelliptic curves over finite fields. It is a
heuristic algorithm under certain assumptions. Therefore, we need to choose curves
such that the genus is not overly large.

C4 : Additive Embedding Attack

Condition C4 is needed to resist Rück’s generalization [Ru97] of the
Semaev-Smart-Satoh-Araki attack [Se98,Sm 97,SA97] on elliptic cryptosystems with
Frobenius trace one. The method uses an additive version of the Tate pairing to solve
the discrete logarithm of a Jacobian over a finite field of characteristic p and has a
running time of O(n^2 log p) for a Jacobian with cyclic group structure of order p^n.
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B Our Order Counting Method 

B.1 Over a Characteristic Two Finite Field

Throughout this subsection, F denotes an algebraic function field of genus g whose
constant field is the finite field Fq, and P_F denotes the set of places of F/Fq. The
definition, the theorem and the corollary shown below are given in the article [St93].

Definition 1. [St93] The polynomial $L(t) := (1 - t)(1 - qt)Z(t)$ is called the
L-polynomial of the function field F/Fq, where $Z(t)$ denotes the zeta-function of F/Fq.

Theorem 1. [St93]

(a) $L(t) \in \mathbb{Z}[t]$ and $\deg L(t) = 2g$.

(b) $L(t) = q^g t^{2g} L(1/(qt))$.

(c) $L(1) = h$, the class number of F/Fq.

(d) We write $L(t) = \sum_{i=0}^{2g} a_i t^i$. Then the following holds:

(1) $a_0 = 1$ and $a_{2g} = q^g$.

(2) $a_{2g-i} = q^{g-i} a_i$ for $0 \le i \le g$.

(3) $a_1 = N - (q + 1)$, where N is the number of places $P \in P_F$ of degree one.

(e) $L(t)$ factors in $\mathbb{C}[t]$ in the form $L(t) = \prod_{i=1}^{2g} (1 - \alpha_i t)$. The complex
numbers $\alpha_1, \ldots, \alpha_{2g}$ are algebraic integers, and they can be arranged in such a
way that $\alpha_i \alpha_{g+i} = q$ holds for $i = 1, \ldots, g$.

(f) If $L_r(t) := (1 - t)(1 - q^r t)Z_r(t)$ denotes the L-polynomial of the constant field
extension $F_r = F\,F_{q^r}$, then $L_r(t) = \prod_{i=1}^{2g} (1 - \alpha_i^r t)$.

Corollary 1. [St93] Let $S_r := N_r - (q^r + 1)$. Then we have:
$a_0 = 1$, and $i a_i = S_i a_0 + S_{i-1} a_1 + \cdots + S_1 a_{i-1}$, for $i = 1, \ldots, g$.

We can determine the order of Jacobians by the Theorem and the Corollary in the
following algorithm. It should be noted that it is easy to count $N_1, \ldots, N_g$ if Fq is
small.
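
As a worked illustration of how the Theorem and the Corollary combine, the following sketch computes the L-polynomial coefficients from the point counts N_1, ..., N_g and evaluates the order over an extension of degree n as the product of L over the n-th roots of unity. The function names are hypothetical, the roots of unity are evaluated in floating point (adequate only for small parameters), and the example uses a genus 1 curve, y^2 + y = x^3 + x over F_2 with N_1 = 5, chosen only because its point count is easy to verify by hand.

```python
import cmath


def l_polynomial(q, g, N):
    """Coefficients a_0..a_{2g} of L(t), given N = [N_1, ..., N_g]."""
    a = [0] * (2 * g + 1)
    a[0] = 1
    for i in range(1, g + 1):
        # Corollary 1: i * a_i = S_i a_0 + S_{i-1} a_1 + ... + S_1 a_{i-1}
        s = sum((N[k - 1] - (q ** k + 1)) * a[i - k] for k in range(1, i + 1))
        assert s % i == 0, "inconsistent point counts"
        a[i] = s // i
    for i in range(g + 1, 2 * g + 1):
        # Theorem 1 (d)(2): a_{2g-j} = q^{g-j} a_j with j = 2g - i
        a[i] = q ** (i - g) * a[2 * g - i]
    return a


def jacobian_order(q, g, N, n=1):
    """Order of the Jacobian over F_{q^n}: product of L(zeta) over zeta^n = 1."""
    a = l_polynomial(q, g, N)
    prod = 1
    for k in range(1, n + 1):
        zeta = cmath.exp(2j * cmath.pi * k / n)
        prod *= sum(c * zeta ** i for i, c in enumerate(a))
    return round(prod.real)


# y^2 + y = x^3 + x over F_2: four affine points plus the point at infinity.
print(l_polynomial(2, 1, [5]))
print([jacobian_order(2, 1, [5], n) for n in (1, 2, 3)])
```

For production-size parameters the product over roots of unity would be evaluated exactly, for example through power sums of the reciprocal roots, rather than in floating point.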



B.2 Over a Prime Finite Field

Let n = 2g + 1 be an odd prime, and let p ≡ 1 (mod n). The order of a Jacobian over
Fp of the curve of the form v^2 + v = u^n can be found as follows [BK97,Ko98]:
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Order Counting 

Input : Hyperelliptic curve C : v^2 + h(u)v = f(u) over Fq and extension degree n
Output: The order #Jc(F_{q^n})

1. Determine $N_r = \#C(F_{q^r})$, for $r = 1, \ldots, g$,
   by counting the number of rational points of C over $F_{q^r}$.

2. Determine the coefficients of $L_F(t) = \sum_{i=0}^{2g} a_i t^i$ as follows:

   $a_0 = 1$

   for $1 \le i \le g$: $a_i = \left( \sum_{k=1}^{i} (N_k - (q^k + 1))\, a_{i-k} \right) / i$

   for $g + 1 \le i \le 2g$: $a_i = q^{i-g} a_{2g-i}$

3. Compute $L_{F_n}(1) = \prod_{\zeta^n = 1} L_F(\zeta)$, where $\zeta$ runs over the n-th roots of unity.

4. Return $\#J_C(F_{q^n}) = L_{F_n}(1)$.



Let $\zeta = e^{2\pi i/n}$ and let $a \in F_p$ be a fixed non-nth-power. There is a unique
multiplicative map $\chi$ on $F_p^*$ such that $\chi(a) = \zeta$. We extend this character $\chi$ to Fp by
setting $\chi(0) = 0$. The Jacobi sum of the character $\chi$ with itself is defined as follows:

$J(\chi, \chi) = \sum_{y \in F_p} \chi(y)\chi(1 - y)$

For $1 \le i \le n - 1$ let $\sigma_i$ be the automorphism of the field $\mathbb{Q}(\zeta)$ such that $\sigma_i(\zeta) = \zeta^i$.
Then the number of points on the curve of the form v^2 + v = u^n, including the point
at infinity, is equal to

$M = p + 1 + \sum_{i=1}^{n-1} \sigma_i(J(\chi, \chi))$

The number N of points on the Jacobian of the curve is equal to

$N = \prod_{i=1}^{n-1} \sigma_i(J(\chi, \chi) + 1) = \mathbf{N}(J(\chi, \chi) + 1)$

where $\mathbf{N}$ denotes the norm of an algebraic number. Using the LLL-algorithm [LLL82],
it is possible to determine $J(\chi, \chi)$ in polynomial time.
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Abstract. It is often required in many elliptic curve cryptosystems to 
compute kG for a fixed point G and a random integer k. In this paper
we present improved algorithms for such elliptic scalar multiplication.
Implementation results on Pentium II and Alpha 21164 microprocessors 
are also provided to demonstrate the presented improvements in actual 
implementations. 



1 Introduction 

Let E be an elliptic curve defined over a finite field F (F = GF(2^n) or GF(p^n)
for a prime p). Let G be a point of prime order in E. Elliptic scalar multiplication
is to compute kG for random k. The performance of elliptic curve cryptosystems
mainly depends on how efficiently this scalar multiplication can be performed.
If G is random, the signed window algorithm is the most preferred algorithm for
general scalar multiplication (e.g., see [6, 7]). If E is defined over a small subfield
such as GF(2^r) with r | n or GF(p) for n > 1, then general scalar multiplication
can be performed much faster using Frobenius expansion [10, 15, 18, 16, 4, 9].

On the other hand, it is often required in elliptic curve cryptosystems to 
compute kG for a fixed point G. Since G is now fixed, we can substantially 
speed up the computation of kG using a precomputed table. Several methods 
have been developed for fast exponentiation using precomputation over a generic 
group [3,17,11], which can thus be applied equally well to the elliptic curve 
group. Among them, the Lim-Lee algorithm (LL algorithm, for short) is known 
to provide higher efficiency and flexibility in time-storage tradeoffs. 

In this paper we investigate further improvements of the LL algorithm for 
elliptic scalar multiplication. Note that field inversion is most expensive among 
field operations required for elliptic curve arithmetic in most interesting fields. 
So, we tried to reduce the number of field inversions, at the cost of more field
multiplications, utilizing the parallelizability of the LL algorithm and the
simultaneous inversion technique [5, Algorithm 10.3.4]. Obviously, the amount of
improvement that can be achieved with the resulting algorithm, Algorithm LL-SA,
depends on the cost ratio of field inversion to multiplication. Our implementations
on Pentium II and Alpha 21164 show that Algorithm LL-SA achieves about
20% speed-up over Algorithm LL in most interesting fields. Further improvement
can be obtained by computing many scalar multiples in parallel. This simultaneous
scalar multiplication algorithm, Algorithm LL-SM, may be useful for heavily
loaded security servers, which often need to process hundreds of transactions
(requiring scalar multiplications) at a time. We also show that these algorithms
can be used to speed up general scalar multiplication using Frobenius expansion.

This paper is organized as follows. In section 2 we briefly summarize elliptic
curve arithmetic (with some improvements) in GF(2^n) and GF(p^n). We then
present improvements of the LL algorithm using simultaneous elliptic addition
(Algorithm LL-SA) and simultaneous elliptic scalar multiplication (Algorithm
LL-SM) in sections 3 and 4, respectively. Section 5 deals with application of LL
algorithms to speed up general scalar multiplication using Frobenius expansion in
GF(p^n). Finally we present our implementation results in section 6 and conclude
in section 7.

2 Elliptic Curve Arithmetic in Finite Fields 

2.1 Affine Coordinates 

A non-supersingular elliptic curve defined over a finite field F is a set of points
(x, y) given by the cubic equation

$y^2 + xy = x^3 + ax^2 + b$  ($a, b \in F$, $b \ne 0$)  if char(F) = 2,

$y^2 = x^3 + ax + b$  ($a, b \in F$, $4a^3 + 27b^2 \ne 0$)  if char(F) > 3,

together with a ‘point at infinity’. Addition/doubling formulas in this affine
representation are summarized in Table 1.



field      operation          λ                            addition formulas

GF(2^n)    addition (A_e)     λ = (y1 + y0)/(x1 + x0)      x2 = λ^2 + λ + x0 + x1 + a
           (x0 ≠ x1)                                       y2 = λ(x0 + x2) + x2 + y0

           doubling (D_e)     λ = x0 + y0/x0               x2 = λ^2 + λ + a
           (x0 = x1)                                       y2 = λ(x0 + x2) + x2 + y0

GF(p^n)    addition (A_e)     λ = (y1 - y0)/(x1 - x0)      x2 = λ^2 - (x0 + x1)
           (x0 ≠ x1)                                       y2 = λ(x0 - x2) - y0

           doubling (D_e)     λ = (3 x0^2 + a)/(2 y0)      x2 = λ^2 - 2 x0
           (x0 = x1)                                       y2 = λ(x0 - x2) - y0

Table 1. Addition formulas in affine coordinates: (x2, y2) = (x0, y0) + (x1, y1)
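
The GF(p^n) rows of Table 1 can be exercised directly for n = 1. The sketch below is an illustration only: the curve y^2 = x^3 + 2x + 2 over GF(17) and the point G = (5, 1) are a small textbook example chosen as an assumption, not parameters from this paper, and the function name `affine_add` is hypothetical. The point at infinity is represented by `None`.

```python
def affine_add(P0, P1, a, p):
    """(x2, y2) = (x0, y0) + (x1, y1) on y^2 = x^3 + ax + b over GF(p), p > 3."""
    if P0 is None:
        return P1
    if P1 is None:
        return P0
    x0, y0 = P0
    x1, y1 = P1
    if x0 == x1:
        if (y0 + y1) % p == 0:
            return None                                  # P1 = -P0
        # doubling: lambda = (3 x0^2 + a) / (2 y0), one field inversion
        lam = (3 * x0 * x0 + a) * pow(2 * y0 % p, -1, p) % p
        x2 = (lam * lam - 2 * x0) % p
    else:
        # addition: lambda = (y1 - y0) / (x1 - x0), one field inversion
        lam = (y1 - y0) * pow((x1 - x0) % p, -1, p) % p
        x2 = (lam * lam - (x0 + x1)) % p
    y2 = (lam * (x0 - x2) - y0) % p
    return (x2, y2)


a, p = 2, 17
G = (5, 1)
G2 = affine_add(G, G, a, p)     # doubling
G3 = affine_add(G2, G, a, p)    # addition
print(G2, G3)
```

Each call performs exactly one modular inversion (`pow(..., -1, p)`, available from Python 3.8), which is the operation the projective representation of the next subsection avoids.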



2.2 Projective Coordinates 

There is another representation of points, the so-called (weighted) projective
representation, which eliminates the expensive field inversion at the cost of more
field multiplications.

GF(2^n). For conversions between affine and projective coordinates, we used
the transformation in [14]: $x = X/Z$, $y = Y/Z^2$. To the best of our knowledge, this
is the best known conversion rule for GF(2^n). The resulting formulas for elliptic
addition and doubling are given below.*

- Addition formula: $(X_2, Y_2, Z_2) = (X_0, Y_0, Z_0) + (X_1, Y_1, 1)$

$A = X_0 + X_1 Z_0$,  $B = Y_0 + Y_1 Z_0^2$,  $C = A Z_0$,
$Z_2 = C^2$,  $X_2 = B^2 + A^2(C + aZ_0^2) + BC$,
$Y_2 = (BC + Z_2)(X_2 + X_1 Z_2) + (X_1 + Y_1) Z_2^2$

- Doubling formula: $(X_2, Y_2, Z_2) = 2(X_0, Y_0, Z_0)$

$Z_2 = X_0^2 Z_0^2$,  $X_2 = X_0^4 + bZ_0^4$,  $Y_2 = bZ_0^4 (X_2 + Z_2) + X_2 (aZ_2 + Y_0^2)$.

The addition formula requires 9 (8 general, 1 constant) multiplications and
5 squarings, while the doubling formula requires 5 (3 general, 2 constant)
multiplications and 5 squarings. Note that the above addition formula requires one
multiplication fewer than the formula given in [14]. If a = 0, then we can save
one further multiplication in each formula.

GF(p^n). The addition/doubling formulas described here are essentially the
same as those of the IEEE P1363 Draft [20]. The coordinates conversion is done
by $x = X/Z^2$, $y = Y/(2Z^3)$.** So the affine coordinate (x, y) should be mapped to the
projective coordinate (X, Y, Z) = (x, 2y, 1). The resulting formulas for elliptic
addition/doubling are described below, where we only consider the special case
of $Z_1 = 1$ as before.

- Addition formula: $(X_2, Y_2, Z_2) = (X_0, Y_0, Z_0) + (X_1, Y_1, 1)$

$A = X_0 + X_1 Z_0^2$,  $B = X_0 - X_1 Z_0^2$,  $C = Y_0 + Y_1 Z_0^3$,  $D = Y_0 - Y_1 Z_0^3$,  $E = 2B$,
$Z_2 = Z_0 E$,  $X_2 = D^2 - AE^2$,  $Y_2 = D(AE^2 - 2X_2) - E^2 BC$

- Doubling formula: $(X_2, Y_2, Z_2) = 2(X_0, Y_0, Z_0)$

$A = 3X_0^2 + aZ_0^4$,  $B = 2X_0 Y_0^2$,  $C = Y_0^4$,
$Z_2 = Y_0 Z_0$,  $X_2 = A^2 - B$,  $Y_2 = A(B - 2X_2) - C$

The above formulas show that elliptic addition requires 8 multiplications and
3 squarings, while elliptic doubling requires 4 (3 general, 1 constant)
multiplications and 6 squarings. If a = -3, then the variable A in doubling can be
computed by $A = 3(X_0 + Z_0^2)(X_0 - Z_0^2)$, so one can save 2 squarings in this case.

* Here we only describe the special case of $Z_1 = 1$ for elliptic addition, which
corresponds to the case where precomputation is done in affine coordinates in
double-and-add algorithms for scalar multiplication. This special case gives better
performances in almost all cases.

** The factor 2 in y is included to eliminate the modular division by 2 appearing in
the addition formula when using $y = Y/Z^3$ (see A.10.5 in [20]). This also reduces the
number of field additions/subtractions required in the doubling formula. Note that
the addition/subtraction time in GF(p^n) is not negligible (see Sect. 6).
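
The GF(p^n) projective formulas with the conversion x = X/Z^2, y = Y/(2Z^3) can be checked numerically for n = 1. The sketch below is illustrative only: the curve y^2 = x^3 + 2x + 2 over GF(17) and the point (5, 1) are assumed toy parameters, the function names are hypothetical, and exceptional cases (point at infinity, adding a point to itself or to its negative) are not handled.

```python
P_MOD = 17      # toy prime, an assumption for illustration
A_COEF = 2      # curve coefficient a in y^2 = x^3 + a x + b


def to_projective(x, y):
    """Affine (x, y) -> (X, Y, Z) = (x, 2y, 1)."""
    return (x % P_MOD, 2 * y % P_MOD, 1)


def to_affine(X, Y, Z):
    """x = X / Z^2, y = Y / (2 Z^3); the only place an inversion is needed."""
    zi = pow(Z, -1, P_MOD)
    half = pow(2, -1, P_MOD)
    return (X * zi * zi % P_MOD, Y * half * zi ** 3 % P_MOD)


def proj_double(X0, Y0, Z0):
    A = (3 * X0 * X0 + A_COEF * Z0 ** 4) % P_MOD
    B = 2 * X0 * Y0 * Y0 % P_MOD
    C = Y0 ** 4 % P_MOD
    Z2 = Y0 * Z0 % P_MOD
    X2 = (A * A - B) % P_MOD
    Y2 = (A * (B - 2 * X2) - C) % P_MOD
    return (X2, Y2, Z2)


def proj_add_mixed(X0, Y0, Z0, X1, Y1):
    """(X0, Y0, Z0) + (X1, Y1, 1), the special case Z1 = 1."""
    z2, z3 = Z0 * Z0 % P_MOD, Z0 ** 3 % P_MOD
    A = (X0 + X1 * z2) % P_MOD
    B = (X0 - X1 * z2) % P_MOD
    C = (Y0 + Y1 * z3) % P_MOD
    D = (Y0 - Y1 * z3) % P_MOD
    E = 2 * B % P_MOD
    Z2 = Z0 * E % P_MOD
    X2 = (D * D - A * E * E) % P_MOD
    Y2 = (D * (A * E * E - 2 * X2) - E * E * B * C) % P_MOD
    return (X2, Y2, Z2)


G = to_projective(5, 1)
G2 = proj_double(*G)
G3 = proj_add_mixed(*G2, G[0], G[1])
print(to_affine(*G2), to_affine(*G3))
```

The results agree with the affine formulas of Table 1 for the same toy curve, namely 2G = (6, 3) and 3G = (10, 6), without any inversion until the final conversion.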
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2.3 Performance and Preferred Coordinates 

In Table 2 we summarized the number of field operations for elliptic curve
arithmetic in affine and projective coordinates. Here the capital letters I, M, S and
A denote field operations of inversion, multiplication, squaring and addition,
respectively. We assumed fixed values for constant a for performance reasons: a = 0
for GF(2^n) and a = -3 for GF(p^n). It should be noted that these special values
for constant a do not place much restriction on the choice of elliptic curves, since
the proportion of elliptic curves that can be rescaled to have the above values
for constant a is approximately 1/2 for GF(2^n) and 1/2 or 1/4, depending on
the residue of p mod 4, for GF(p^n) (see Appendix A in [20]).



field       coordinates        doubling (D_e)           addition (A_e)

GF(2^n)     Affine             1I + 2M + 1S + 5A        1I + 2M + 1S + 7A
(a = 0)     Proj. (Z1 = 1)     4M + 5S + 3A             8M + 5S + 8A

GF(p^n)     Affine             1I + 2M + 2S + 7A        1I + 2M + 1S + 6A
(a = -3)    Proj. (Z1 = 1)     4M + 4S + 9A             8M + 3S + 9A

Table 2. The number of field operations for elliptic addition/doubling



To simplify performance comparisons, we will use the following assumptions
on speed ratios between field operations throughout this paper: 1S = 0.15M,
constant multiplication = 0.5M for GF(2^n) and 1S = 0.8M, 1A = 0.15M for
GF(p^n) (addition times in GF(2^n) neglected). Of course, these ratios may vary
from implementation to implementation, but our optimized implementations
on P6 and Alpha microprocessors (see Sect. 6) show that in most interesting
fields the above assumptions are reasonable enough for theoretical comparison
of computational complexity.

The cost ratio of field inversion to multiplication (I/M) is a key factor in
determining a preferred coordinate system. So, let us find the I/M value at
the break-even point between affine representation and projective representation
(with Z1 = 1). For this, suppose that r = N_{D_e}/N_{A_e} (i.e., r elliptic doublings
are required for each elliptic addition in a scalar multiplication algorithm). For
example, we have r = 6 for the signed window algorithm with window size 4 and
r < 1 for the LL algorithm. From Table 2 and the assumptions on speed ratios
between field operations, we can obtain the following relations at the break-even
point:

I/M = 2.60 + 4.00/(r + 1)   for GF(2^n),
I/M = 3.90 + 4.15/(r + 1)   for GF(p^n).

Thus, for large r, it is almost always preferable to do elliptic scalar multiplication
in projective coordinates. However, in the LL algorithm, we have 0.2 < r < 0.7
for most interesting parameters, so 4.95 < I/M < 5.93 for GF(2^n) and 6.34 <
I/M < 7.36 for GF(p^n). Thus, as we will see later, affine coordinates may yield
better performances than projective coordinates in the case of GF(p^n).
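
The break-even ratios can be recomputed from the operation counts of Table 2 and the stated speed ratios (1S = 0.15M with additions neglected for GF(2^n); 1S = 0.8M and 1A = 0.15M for GF(p^n)). The sketch below equates the affine cost r(I + d_a) + (I + a_a) with the projective cost r d_p + a_p and solves for I; the dictionary layout and the function name `break_even` are hypothetical.

```python
# Cost of each operation in multiplications (M), excluding the single
# inversion that every affine operation needs. Derived from Table 2.
COST = {
    "GF(2^n)": {
        "affine_dbl": 2 + 1 * 0.15,              # 2M + 1S (additions neglected)
        "affine_add": 2 + 1 * 0.15,              # 2M + 1S
        "proj_dbl": 4 + 5 * 0.15,                # 4M + 5S
        "proj_add": 8 + 5 * 0.15,                # 8M + 5S
    },
    "GF(p^n)": {
        "affine_dbl": 2 + 2 * 0.8 + 7 * 0.15,    # 2M + 2S + 7A
        "affine_add": 2 + 1 * 0.8 + 6 * 0.15,    # 2M + 1S + 6A
        "proj_dbl": 4 + 4 * 0.8 + 9 * 0.15,      # 4M + 4S + 9A
        "proj_add": 8 + 3 * 0.8 + 9 * 0.15,      # 8M + 3S + 9A
    },
}


def break_even(field, r):
    """I/M at which affine and projective coordinates cost the same,
    for r doublings per addition."""
    c = COST[field]
    dbl_gap = c["proj_dbl"] - c["affine_dbl"]
    add_gap = c["proj_add"] - c["affine_add"]
    return (r * dbl_gap + add_gap) / (r + 1)


for field in COST:
    print(field, [round(break_even(field, r), 2) for r in (0.2, 0.7, 6)])
```

For r = 0.2 and r = 0.7 this reproduces the bounds quoted for the LL algorithm, and for r = 6 it shows how far the break-even point drops for window methods.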
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3 The Improved LL Algorithm for Scalar Multiplication 

3.1 The Original LL Algorithm 

We briefly describe the Lim-Lee algorithm for elliptic scalar multiplication kG
for a fixed point G and analyze its performance. First, the multiplier k of l bits
is divided into hv subblocks of b bits as follows (see Figure 1):

$k = \sum_{t=0}^{l-1} e_t 2^t = \sum_{i=0}^{h-1} \left( \sum_{j=0}^{v-1} k_{i,j}\, 2^{jb} \right) 2^{ia}$, where

$a = \lceil l/h \rceil$,  $b = \lceil a/v \rceil$,  $k_{i,j} = \sum_{t=0}^{b-1} e_{ia+jb+t}\, 2^t$.



[Figure: the multiplier k arranged as h rows of a bits each, every row split into
v subblocks k_{i,v-1}, ..., k_{i,1}, k_{i,0} of b bits.]

Fig. 1. Partition of an l-bit multiplier k for the LL Algorithm



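In the (off-line) precomputation stage, we compute and store the points
GG[I][j] as follows:

$G_{i,j} = 2^{ia+jb} G$ for $0 \le i < h$ and $0 \le j < v$,

$GG[I][j] = \sum_{i=0}^{h-1} c_i G_{i,j}$ for $0 \le j < v$, where $I = \sum_{i=0}^{h-1} 2^i c_i$.   (1)

Using these precomputed values, we can express kG as

$kG = \sum_{j=0}^{v-1} \sum_{i=0}^{h-1} \left( \sum_{t=0}^{b-1} e_{ia+jb+t}\, 2^t \right) G_{i,j}$

$\quad = \sum_{t=0}^{b-1} 2^t \left( \sum_{j=0}^{v-1} \sum_{i=0}^{h-1} e_{ia+jb+t}\, G_{i,j} \right)$

$\quad = \sum_{t=0}^{b-1} 2^t \left( \sum_{j=0}^{v-1} GG[I_{j,t}][j] \right)$,

where $I_{j,t} = \sum_{i=0}^{h-1} 2^i e_{ia+jb+t}$.   (2)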
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Algorithm LL 

T := \sum_{j=0}^{v-1} GG[I_{j,b-1}][j];

for t := b - 2 to 0 step -1
    T := 2T;

    T := T + \sum_{j=0}^{v-1} GG[I_{j,t}][j];
return T;



Note that $I_{j,t}$ corresponds to the t-th bit column of the j-th block column in
Figure 1. Now, we can compute kG for each new value of k using equation (2)
as shown in Algorithm LL.
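
To make the table indexing concrete, the following sketch runs Algorithm LL with Python integers under addition standing in for the elliptic curve group, so that kG is simply k * G and the result can be checked directly. This substitution, the function names, and the explicit handling of the shorter last subblock (bit positions jb + t >= a are skipped) are assumptions made for the illustration.

```python
def ll_precompute(G, l, h, v):
    """Build GG[I][j] = sum_i c_i * 2^(i*a + j*b) * G with I = sum_i 2^i c_i."""
    a = -(-l // h)                      # a = ceil(l / h)
    b = -(-a // v)                      # b = ceil(a / v)
    table = [[0] * v for _ in range(1 << h)]
    for j in range(v):
        for I in range(1, 1 << h):
            table[I][j] = sum(G << (i * a + j * b)
                              for i in range(h) if (I >> i) & 1)
    return a, b, table


def ll_multiply(k, h, v, a, b, table):
    """Algorithm LL: b - 1 doublings and at most a - 1 additions."""
    def index(j, t):
        pos = j * b + t
        if pos >= a:                    # last subblock may be shorter than b
            return 0
        return sum(((k >> (i * a + pos)) & 1) << i for i in range(h))

    def column(t):
        return sum(table[index(j, t)][j] for j in range(v))

    T = column(b - 1)
    for t in range(b - 2, -1, -1):
        T = 2 * T                       # group doubling
        T = T + column(t)               # group additions
    return T


l, G = 160, 7
k = (1 << 160) - 12345
for h, v in [(2, 2), (3, 4), (4, 2), (8, 4)]:
    a, b, table = ll_precompute(G, l, h, v)
    print(h, v, ll_multiply(k, h, v, a, b, table) == k * G)
```

With a real curve, the integer shifts in the precomputation become repeated point doublings and the sums become point additions; the index computation is unchanged.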

Let us count the number of additions/doublings required by Algorithm LL.
Obviously, we only need (b - 1) doublings. For the number of additions required,
we note that the number of $GG[I_{j,t}][j]$ to be added is at most a. Therefore, we
can see that the total cost for the worst case is given by

$C_{LLw}(l, h, v) = (a - 1)A_e + (b - 1)D_e$.

Let q be the probability of a bit being zero (so the probability of $I_{j,t}$ being zero
is $q^h$). Then we can easily derive the expected number of additions/doublings as

$C_{LLa}(l, h, v) = (a - q^h(a + (ah - l)(q^{-1} - 1)) - 1)A_e + (b - 1)D_e$.   (3)

For random k, we may assume that q = 1/2. In this case, equation (3) becomes

$C_{LLa}(l, h, v) = \left( a - 1 - \frac{a + ah - l}{2^h} \right) A_e + (b - 1)D_e$.   (4)

It is also easy to see that Algorithm LL requires the storage for $(2^h - 1)v$
precomputed points and that the cost for precomputation is given by

$C_{LLp}(l, h, v) = v(2^h - h - 1)A_e + b(hv - 1)D_e$.



Table 3 shows the average number of field inversions and multiplications,
$N_I I + N_M M$, given by $C_{LLa}(160, h, v)$ for some selected parameters h and v, where
we used the assumptions in Sect. 2.3 to compute the equivalent number of field
multiplications required for elliptic addition/doubling. In the case of projective
coordinates, we also included the cost for coordinates conversion back to affine
coordinates. The last two columns of Table 3 show the I/M ratios at the
break-even point between computations in affine and projective coordinates. The ratios
range from 5 to 6 in GF(2^n) and from 6.5 to 8 in GF(p^n). Our implementations
on P6 and Alpha (see Sect. 6) show that actual I/M ratios are larger than 10 for
elliptic curves in GF(2^n) and in GF(p^n) with small n, so projective coordinates
are preferred for these cases.

config.   storage      Affine (N_I I + N_M M)                Proj. (N_I I + N_M M)           I/M at B.E.P.
h x v     (2^h - 1)v   GF(2^n)           GF(p^n)             GF(2^n)        GF(p^n)          GF(2^n)   GF(p^n)
2 x 2     6            98.0I + 210.7M    98.0I + 399.6M      1I + 684.1M    1I + 1031M       4.88      6.51
2 x 4     12           78.0I + 167.7M    78.0I + 306.6M      1I + 599.1M    1I + 859.8M      5.60      7.18
3 x 2     14           72.0I + 154.8M    72.0I + 291.1M      1I + 515.1M    1I + 766.9M      5.08      6.70
3 x 4     28           59.0I + 126.8M    59.0I + 230.6M      1I + 459.9M    1I + 655.8M      5.74      7.33
4 x 2     30           55.5I + 119.3M    55.5I + 223.4M      1I + 402.3M    1I + 595.4M      5.19      6.83
4 x 4     60           45.5I + 97.8M     45.5I + 176.9M      1I + 359.8M    1I + 509.9M      5.89      7.48
5 x 2     62           45.0I + 96.8M     45.0I + 180.8M      1I + 328.4M    1I + 484.9M      5.26      6.91
5 x 4     124          37.0I + 79.5M     37.0I + 143.5M      1I + 294.4M    1I + 416.5M      5.97      7.58
6 x 2     126          38.5I + 82.9M     38.5I + 155.0M      1I + 280.9M    1I + 415.4M      5.27      6.94
6 x 4     252          31.5I + 67.8M     31.5I + 122.4M      1I + 251.2M    1I + 355.6M      6.00      7.63
7 x 2     254          32.8I + 70.5M     32.8I + 131.9M      1I + 239.8M    1I + 354.4M      5.32      7.00
7 x 4     508          26.8I + 57.6M     26.8I + 104.0M      1I + 214.3M    1I + 303.1M      6.07      7.72
8 x 2     510          27.9I + 60.0M     27.9I + 111.9M      1I + 206.0M    1I + 303.4M      5.42      7.11
8 x 4     1020         22.9I + 49.3M     22.9I + 88.6M       1I + 184.7M    1I + 260.6M      6.18      7.85

Table 3. Average performance of Algorithm LL for computing kG with |k| = 160



3.2 The Improved LL Algorithm 

Computation of multiple inverses modulo the same modulus can be substantially
sped up using Montgomery’s trick for parallel inversion [5, Algorithm
10.3.4]. For example, to compute inverses of A and B modulo p, we first compute
$C = (AB)^{-1} \bmod p$ and then $A^{-1} = CB \bmod p$ and $B^{-1} = CA \bmod p$. In

general, this simultaneous inversion algorithm requires 1 inversion and 3(t - 1)
multiplications mod p for t-simultaneous inversion. Therefore, from Table 2, we
can see that t-simultaneous elliptic addition in GF(p^n) requires the computational
cost of (I - 3M) + t(5M + S + 6A). Similarly, t-simultaneous elliptic
addition in GF(2^n) requires (I - 3M) + t(5M + S + 7A). This technique thus
enables us to replace one field inversion by about 3 field multiplications for large
t. The resulting cost savings are substantial, since field inversion costs more than
3 field multiplications in most interesting fields.
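
A minimal sketch of simultaneous inversion modulo a prime p follows, using exactly 1 inversion and 3(t - 1) multiplications for t inputs. The function name `simultaneous_inverse` is hypothetical, all inputs are assumed to be non-zero modulo p, and `pow(x, -1, p)` (Python 3.8 or later) stands in for the field inversion routine.

```python
def simultaneous_inverse(values, p):
    """Invert every element of `values` modulo p with a single inversion."""
    t = len(values)
    # prefix[i] = values[0] * ... * values[i]  (t - 1 multiplications)
    prefix = [values[0] % p]
    for x in values[1:]:
        prefix.append(prefix[-1] * x % p)

    inv = pow(prefix[-1], -1, p)        # the only inversion

    out = [0] * t
    for i in range(t - 1, 0, -1):
        out[i] = inv * prefix[i - 1] % p    # inverse of values[i]
        inv = inv * values[i] % p           # inverse of prefix[i - 1]
    out[0] = inv
    return out                          # 2(t - 1) further multiplications


p = 101
vals = [3, 5, 7, 11]
invs = simultaneous_inverse(vals, p)
print(invs, [x * y % p for x, y in zip(vals, invs)])
```

In t-simultaneous elliptic addition the inputs would be the t denominators of the slopes λ, which is how the t inversions of the affine formulas collapse into one.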

Now, let us consider how to achieve a maximal improvement of Algorithm
LL using the simultaneous addition algorithm. First note that in Algorithm LL
we may precompute and store the following b points ahead of time:

$GGG[t] = \sum_{j=0}^{v-1} GG[I_{j,t}][j]$ for $0 \le t \le b - 1$.   (5)

Then, we can just add one point GGG[t] to T in the t-th iteration of the
for-loop (see Algorithm LL-SA). Thanks to the high degree of parallelism existing
in equation (5), we can take much advantage of simultaneous inversion in this
on-line precomputation stage.
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Algorithm LL-SA 
for t := 0 to b - 1 step 1

    GGG[t] := \sum_{j=0}^{v-1} GG[I_{j,t}][j];
T := GGG[b - 1];
for t := b - 2 to 0 step -1
    T := 2T;

    T := T + GGG[t];
return T;



A naive way to evaluate equation (5) is to iterate b-simultaneous elliptic
addition v - 1 times (so, v - 1 inversions are required). However, the number of
inversions required can be further reduced by performing b-simultaneous elliptic
additions in parallel (based on the binary tree structure). E.g., if v = 4, we do
the computation as follows:

1. GGG[t] = GG[I_{0,t}][0] + GG[I_{1,t}][1] for 0 <= t <= b - 1 and
   TTT[t] = GG[I_{2,t}][2] + GG[I_{3,t}][3] for 0 <= t <= b - 1.

2. GGG[t] = GGG[t] + TTT[t] for 0 <= t <= b - 1.

This way we can reduce the number of inversions from v - 1 to ⌈log_2 v⌉. This
method of course increases the requirement for temporary storage from b to ⌊v/2⌋b.
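The binary-tree schedule can be sketched as follows. This is only an illustration of the round structure: plain integers stand in for elliptic curve points (an assumption made so the example is self-contained), and each round of the loop corresponds to one batch of simultaneous additions, hence one field inversion. The name `tree_sum_rounds` is hypothetical.

```python
def tree_sum_rounds(columns):
    """Sum v columns of b entries pairwise, as in the binary-tree schedule.

    `columns[j][t]` plays the role of GG[I_{j,t}][j]; integers stand in for
    curve points. Returns (sums, rounds), where `rounds` is the number of
    simultaneous-addition batches, i.e. the number of inversions needed.
    """
    level = [list(col) for col in columns]
    rounds = 0
    while len(level) > 1:
        nxt = []
        # Add neighbouring columns entry by entry; all of these additions
        # belong to the same batch and can share one inversion.
        for i in range(0, len(level) - 1, 2):
            nxt.append([x + y for x, y in zip(level[i], level[i + 1])])
        if len(level) % 2:
            nxt.append(level[-1])  # odd column is carried to the next round
        level = nxt
        rounds += 1
    return level[0], rounds
```

With v = 4 columns the function needs 2 rounds, and with v = 5 it needs 3, matching ⌈log_2 v⌉.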

Suppose that c elliptic additions are required for the on-line precomputation
of the GGG[t]'s. This requires field operations given by

\[ C_{SA}(c) = \begin{cases}
\lceil \log_2 v \rceil (I - 3M) + c(5M + S + 7A) & \text{for } GF(2^n), \\
\lceil \log_2 v \rceil (I - 3M) + c(5M + S + 6A) & \text{for } GF(p^n).
\end{cases} \tag{6} \]

Since c elliptic additions in Algorithm LL are now replaced by C_SA(c) in
Algorithm LL-SA, we can obtain the cost of Algorithm LL-SA as

\[ C_{LL\text{-}SA}(l,h,v) = C_{LL}(l,h,v) - \Delta C(c), \quad \text{where } \Delta C(c) = c A_E - C_{SA}(c). \tag{7} \]



Thus we only need to find the average and worst case values of c, c_a and c_w.

In the worst case, we need a - b additions in the precomputation stage, so we
have c_w = a - b. Considering the probability of GG[I][j] being the 'point at
infinity', we can find the expected value c_a as

Ca — a — b — q^{a + [ah — — 1)) + S, where 

J = q'^-[b + [ah - l)[q-^ - 1) + [bv - a)[q-'^ - 1)). (8) 



As before, assuming that q = 1/2, we can simplify equation (8) to



— a — b — 



where i ~ 0 +jl>» ~ »)(2'‘ ~ D 



2 >^ 



2hv 
The cost advantage ΔC of Algorithm LL-SA over Algorithm LL can be expressed
in terms of field operations as follows.

Affine: ΔC(c) = (c - ⌈log_2 v⌉)(I - 3M),

Pro! • AC(c) = / + riog2 ^1)^ + dcS" + cA - [loga v'\I for GF(2"), (10) 

        3(c + ⌈log_2 v⌉)M + 2cS + 3cA - ⌈log_2 v⌉I   for GF(p^n).
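The affine entry of equation (10) follows directly from equations (6) and (7). A short derivation for GF(2^n), assuming the affine addition cost A_E = I + 2M + S + 7A, which is the value implied by the simultaneous-addition cost in Sect. 3.2 (for GF(p^n), replace 7A by 6A; the result is the same):

```latex
\begin{align*}
\Delta C(c) &= c\,A_E - C_{SA}(c) \\
  &= c\,(I + 2M + S + 7A) - \lceil \log_2 v \rceil (I - 3M) - c\,(5M + S + 7A) \\
  &= c\,(I - 3M) - \lceil \log_2 v \rceil (I - 3M) \\
  &= \bigl(c - \lceil \log_2 v \rceil\bigr)(I - 3M).
\end{align*}
```

The advantage is therefore positive whenever I > 3M and c > ⌈log_2 v⌉.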

We evaluated the average performance of Algorithm LL-SA using equation
(10) and Table 3. The result is shown in Table 4. Obviously, the amount of
improvement of Algorithm LL-SA over Algorithm LL depends on the I/M ratio
and becomes larger as the I/M ratio increases, since the improvement comes
from the replacement of field inversions in Algorithm LL with about 3 field
multiplications in Algorithm LL-SA. The I/M ratios at the break-even point
shown in the last two columns can be used to determine which coordinates are
preferred for the implementation of Algorithm LL-SA. From Tables 3 and 4 and
the measured I/M ratios (Table 8 in Sect. 6), we can see that the amount of
improvement can be about 10 to 25% for projective coordinates and about 5 to
40% for affine coordinates.



config.  storage     Affine (N_I + N_M)               Proj. (N_I + N_M)           I/M at B.E.P.
h x v    (temp.)     GF(2^n)         GF(p^n)          GF(2^n)       GF(p^n)       GF(2^n)  GF(p^n)
2x2      6(40)       76.5 + 275.2    76.5 + 464.1     2 + 600.1     2 + 914.2     4.36     6.04
2x4      12(40)      39.7 + 282.6    39.7 + 421.6     3 + 448.0     3 + 650.2     4.51     6.23
3x2      14(27)      52.1 + 214.5    52.1 + 350.8     2 + 436.9     2 + 658.3     4.44     6.14
3x4      28(28)      27.8 + 220.6    27.8 + 324.4     3 + 334.3     3 + 481.9     4.59     6.36
4x2      30(20)      38.7 + 169.8    38.7 + 273.8     2 + 335.1     2 + 502.5     4.51     6.23
4x4      60(20)      20.0 + 174.4    20.0 + 253.5     3 + 254.6     3 + 364.9     4.73     6.56
5x2      62(16)      30.9 + 139.1    30.9 + 223.1     2 + 270.9     2 + 405.5     4.57     6.31
5x4      124(16)     16.0 + 142.6    16.0 + 206.6     3 + 205.5     3 + 294.2     4.85     6.75
6x2      126(14)     26.7 + 118.4    26.7 + 190.5     2 + 231.7     2 + 347.5     4.59     6.36
6x4      252(14)     13.9 + 120.7    13.9 + 175.3     3 + 174.6     3 + 250.5     4.93     6.88
7x2      254(12)     22.7 + 100.8    22.7 + 162.1     2 + 196.9     2 + 295.5     4.63     6.43
7x4      508(12)     11.9 + 102.3    11.9 + 148.6     3 + 147.5     3 + 211.9     5.06     7.09
8x2      510(10)     19.0 + 86.8     19.0 + 138.7     2 + 167.2     2 + 250.2     4.73     6.57
8x4      1020(10)    10.0 + 88.1     10.0 + 127.4     3 + 125.0     3 + 179.3     5.28     7.41

Table 4. Average performance of Algorithm LL-SA for computing kG with |k| = 160



4 Simultaneous Scalar Multiplication 

A central security server often needs to handle thousands of transactions, e.g.,
involving Diffie-Hellman key exchanges or digital signatures, at a peak time. For
such a heavily loaded application, further speed-up can be obtained by computing
many scalar multiples simultaneously.

Suppose that we want to evaluate t scalar multiplications, k_i G (0 <= i < t,
|k_i| = l), at a time. We can then perform t instances of Algorithm LL-SA
simultaneously, one for each k_i G. Let us call this algorithm Algorithm LL-SM.
Then the simultaneous inversion technique can be applied even to the
double-and-add parts of concurrent Algorithm LL-SA instances. We thus perform all
elliptic curve arithmetic in affine coordinates, but the number of field inversions
required for t scalar multiplications is reduced to about ⌈log_2 v⌉ + 2b - 2.
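As a quick check of this count, the sketch below divides the approximate total ⌈log_2 v⌉ + 2b - 2 by t to obtain the number of inversions per scalar multiplication. It ignores the small correction for points at infinity, so it is an upper estimate; the function name is illustrative only.

```python
def inversions_per_multiplication(b, v, t):
    """Approximate field inversions per scalar multiple in Algorithm LL-SM.

    Uses the count ceil(log2 v) + 2b - 2 for a batch of t scalar multiples
    and ignores the point-at-infinity correction.
    """
    rounds = (v - 1).bit_length()  # equals ceil(log2 v) for v >= 1
    return (rounds + 2 * b - 2) / t


# Configurations 8x4 (b = 5, v = 4) and 8x2 (b = 10, v = 2) with t = 100
print(inversions_per_multiplication(5, 4, 100))
print(inversions_per_multiplication(10, 2, 100))
```

For t = 100 these two configurations give 0.10 and 0.19 inversions per scalar multiplication, which agrees with the corresponding entries of Table 5.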

From the analysis of Sect. 3, we can easily see that the average performance
of Algorithm LL-SM (assuming that q = 1/2) is given by



GLL-SMa{t, I, h, v) - (|'log 2 v] + 2b - S - 2){I - 3M) -h 

Ag+t{b-l)Dg, (11) 



where J is the same as before (equation (9)) and Ag and Dg are given by 



(5M+1S + 7A for GF(2"), 
\bM +1S + 6A for GF(p"). 
(7>M + 1S + 7>A for GF(2"), 
\bM + 2S + 7A for GF(p"). 



( 12 ) 



Therefore, for large t, the cost of Algorithm LL-SM per scalar multiplication,
i.e., C_LL-SMa(t, l, h, v)/t, is almost the same as the cost of Algorithm LL in
affine coordinates with field inversion replaced by 3 field multiplications. This
would be the best achievable performance per scalar multiplication, as long as
field inversion is more expensive than 3 field multiplications. For example, we
tabulated in Table 5 the cost of Algorithm LL-SM per scalar multiplication for
t = 100. As can be seen from the table, the number of field inversions required
per scalar multiplication is less than 1 for large t. We can also see that Algorithm
LL-SM improves over Algorithm LL-SA by more than 15% under the reasonable
assumption of I/M ratios (see Table 8 in Sect. 6).



5 Speeding Up Scalar Multiplication Using φ-Expansion

We can view an elliptic curve E defined over GF(p) as an elliptic curve defined
over GF(p^n). For such a subfield curve we can achieve a much higher performance
using Frobenius expansion [9]. Let P = (x, y) be a GF(p^n)-point on E. The
Frobenius map φ is defined as φ : (x, y) → (x^p, y^p) and satisfies the equation

\[ \varphi^2 - t\varphi + p = 0 \quad \text{and} \quad \varphi^n = 1, \qquad -2\sqrt{p} \le t \le 2\sqrt{p}. \tag{13} \]

This map can be evaluated using only 2(n - 1) multiplications mod p (see [9]).




112 Chae Hoon Lim and Hyo Sun Hwang



configuration        storage       c_a     (N_I + N_M)/t
h x v    (a, b)      perm/temp             GF(2^n)         GF(p^n)
2x2      (80,40)     6/4000        22.5    0.77 + 502.4    0.77 + 691.4
2x4      (80,20)     12/4000       40.3    0.40 + 400.5    0.40 + 539.5
3x2      (54,27)     14/2700       20.9    0.52 + 369.2    0.52 + 505.5
3x4      (54,14)     28/2800       33.2    0.28 + 303.0    0.28 + 406.8
4x2      (40,20)     30/2000       17.8    0.39 + 284.7    0.39 + 388.7
4x4      (40,10)     60/2000       27.5    0.20 + 233.7    0.20 + 312.8
5x2      (32,16)     62/1600       15.1    0.31 + 230.8    0.31 + 314.8
5x4      (32, 8)     124/1600      23.0    0.16 + 190.1    0.16 + 254.1
6x2      (27,14)     126/1400      12.9    0.27 + 197.7    0.27 + 269.8
6x4      (27, 7)     252/1400      19.6    0.14 + 162.0    0.14 + 216.6
7x2      (23,12)     254/1200      11.1    0.23 + 168.3    0.23 + 229.6
7x4      (23, 6)     508/1200      16.9    0.12 + 137.7    0.12 + 184.0
8x2      (20,10)     510/1000      9.9     0.19 + 143.2    0.19 + 195.1
8x4      (20, 5)     1020/1000     14.9    0.10 + 117.7    0.10 + 157.1

Table 5. Average performance of Algorithm LL-SM for t = 100 and |k_i| = 160



To compute kP, we first express the multiplier k using equation (13) as

\[ k = \sum_{i=0}^{n-1} k_i \varphi^i, \quad \text{where } |k_i| \le p/2, \tag{14} \]

precompute n points P_i = φ^i(P) for 0 <= i < n and then compute kP as

\[ kP = \sum_{i=0}^{n-1} k_i P_i. \tag{15} \]

Note that the bit-length of the k_i's in equation (14) can always be made one bit
less than the bit-length m of p (|p| = m) and that the negative signs can be
absorbed into the precomputed points P_i's. So, we may assume that the
coefficients k_i's in equation (15) are always positive integers of bit-length m - 1.

The main source of efficiency in this scalar multiplication using base-φ
expansion is that the intermediate points P_i's can be evaluated almost for free, only
using 2(n — 1)^ subfield multiplications, and thus about elliptic doublings 

can be saved, compared to general scalar multiplication. The cost we have to pay 
for this improvement is a small amount of on-line precomputation (i.e., base-φ
expansion of k and (n - 1) evaluations of φ), which costs less than a few elliptic
additions. 

The right-hand side of equation (15) can be efficiently evaluated using the
signed binary algorithm with optimal signed encoding of the k_i's [18] (this is
actually the same as Type-II expansion in [9]). Since an optimal signed encoding
of a t-bit integer can produce an integer of bit-length at most t + 1 with
probability of a bit being zero 2/3, this computation can be done in (m - 1) elliptic doublings
and — 1) elliptic additions on average. 
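The two properties quoted above (length at most t + 1, about two thirds of the digits zero) hold for the non-adjacent form (NAF), a standard minimal-weight signed binary encoding. The sketch below uses the NAF as an assumed stand-in; the specific encoding of [18] is not reproduced here, and the function name is illustrative.

```python
def naf(k):
    """Return the non-adjacent form of a positive integer k.

    Digits are in {-1, 0, 1}, least significant first, and no two
    neighbouring digits are both nonzero.
    """
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)  # +1 if k = 1 (mod 4), -1 if k = 3 (mod 4)
            k -= d           # k is now divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits


# 7 = 8 - 1, so the digits (least significant first) are -1, 0, 0, 1
print(naf(7))
```

Here a 3-bit input produces a 4-digit encoding, one digit longer than the binary form, which is the worst case mentioned in the text.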



[Figure 2: three panels, n = 7, n = 11 and n = 13, showing the arrangement of the
coefficients k_0, ..., k_{n-1}]

Fig. 2. Arrangements of k_i's for base-φ scalar multiplication using Algorithm LL



Further speed-up can be achieved using Algorithms LL/LL-SA, as can be
expected from equation (15). Figure 2 shows some possible (actually best on
average) arrangements of the k_i's for using Algorithms LL/LL-SA. Here we only
consider three field extensions of degree 7, 11 and 13, since they are most
interesting in practice. Unlike Algorithms LL/LL-SA in Sect. 3, we now have
to do the required precomputation on-line. It is easy to see that the costs of
on-line precomputation for the configurations shown in Figure 2 are given by
C_LLp(7m, 4, 2) = 15A_E, C_LLp(11m, 3, 4) = 13A_E, and C_LLp(13m, 3, 5) = 14A_E,
respectively. Since it is preferable to do the precomputation in affine coordinates,
we can obtain some speed-up with simultaneous inversion. In this case, the costs
are given by C'_LLp(7m, 4, 2) = 2I + 94.5M, C'_LLp(11m, 3, 4) = 2I + 81.1M, and
C'_LLp(13m, 3, 5) = 2I + 87.8M, respectively.
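These three figures can be reproduced from the GF(p^n) simultaneous-addition cost of Sect. 3.2, (I - 3M) per inversion round plus 5M + S + 6A per addition. The sketch below assumes two inversion rounds (read off from the 2I term in the quoted costs) and the ratios S/M = 0.8 and A/M = 0.15 reported in Sect. 6; the function and constant names are illustrative.

```python
S_OVER_M = 0.8   # assumed squaring/multiplication ratio in GF(p^n)
A_OVER_M = 0.15  # assumed addition/multiplication ratio


def precomputation_cost(additions, inversion_rounds=2):
    """Return (N_I, N_M) for `additions` affine additions done in batches.

    Each batch costs one inversion but saves 3 multiplications relative to
    the 5M + S + 6A charged per addition.
    """
    per_addition = 5 + S_OVER_M + 6 * A_OVER_M
    n_m = additions * per_addition - 3 * inversion_rounds
    return inversion_rounds, n_m


for c in (15, 13, 14):
    n_i, n_m = precomputation_cost(c)
    print(f"{c} additions: {n_i}I + {n_m:.1f}M")
```

Under these assumptions the multiplication counts come out as 94.5, 81.1 and 87.8, the same values as in the text.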

Since the average Hamming weight of ki’s can also be reduced to approxi- 
mately I with some clever weight minimization strategy, we can obtain average 
performances for the evaluation of equation (15) using Algorithms LL/LL-SA 
by substituting a = vm, b = m, l = nm and q = 2/3 in equations (3) and (8):

CbLainm, h, v) — CLLp{nm, h, v) + 

CLL-SAa{nm, h, v) — CLLa{nm, h, v) — AC(ca), where 

f f hv — n\ f hv — n\\ 

Table 6 shows the number of elliptic additions and field inversions required for
three methods of evaluating equation (15), where the computational costs for
base-φ expansion and φ evaluations are not included.

Finally, it is worth noting that though we can obtain much higher efficiencies
using Frobenius expansion with subfield curves, we should be careful about their
security consequences. The structure allowing faster implementations may also
allow faster attacks (e.g., see [19]). #E/GF(p^m) (the order of E/GF(p^m)) divides
#E/GF(p^n) if m divides n, so #E/GF(p^n) contains at least small prime factors
of size #E/GF(p). Since we have to use a prime order subgroup for ECC, this
coord.   n    m = |p|   signed binary    Algorithm LL    Algorithm LL-SA
Affine   7    28        83.0 + 307.1     70.2 + 372.4    55.4 + 416.8
         11   16        68.0 + 251.6     58.7 + 305.0    33.8 + 379.6
         13   14        67.9 + 251.2     59.1 + 311.5    30.9 + 396.0
Proj.    7    28        1.0 + 978.2      3.0 + 812.1     4.0 + 729.2
         11   16        1.0 + 802.0      3.0 + 701.9     5.0 + 560.3
         13   14        1.0 + 800.8      3.0 + 720.2     6.0 + 553.8

Table 6. Average performances (N_I + N_M) of three algorithms for base-φ scalar
multiplication



may increase the order of subfield curves more than necessary. Furthermore,
the small prime factors in #E/GF(p^n) may considerably weaken the resulting
cryptosystem in many applications if proper precautions are not taken (by the
small order subgroup attack in [12]).



6 Implementation and Discussion 

We have implemented Algorithms LL, LL-SA and LL-SM on two different
architectures: Pentium II/266MHz (32-bit μP; Windows 98, MSVC 5.0 with in-line
assembly) and Alpha 21164/533MHz (64-bit μP; Linux, GCC 2.95 with in-line
assembly). Table 7 summarizes the parameters used for field constructions. The
three field parameters with degree n* (n = 7, 11, 13) were included for use
in building subfield curves. The figures in the 'order' column of Table 7 denote
the largest possible prime orders (in bits) in E/GF(p^n). See [13] for details on
selection criteria of field parameters and timings for field/EC arithmetic.



field 


n 


order 


P 


irred. poly. 


GF(2^®2) 


162 


162 




^l®2+^27^1 


GF(p") 


13* 


168 


1 

CO 


r'®-2 


12 


168 


2"“ -3 


- 2 


11* 


160 


2^® - 437 


-2 


10 


160 


2^® - 165 


O 

1 


Y* 


168 


2^® - 57 


x‘ - 2 


6 


168 


2^^® - 165 


r® - 2 


5 


160 


2®2 - 5 


r® -2 


3 


171 


2®^ - 13 


r® -2 


2 


178 


2®9 _ ^ 


x^ — 3 


1 


160 


p = 2^®® - 2933 



Table 7. Field constructions for elliptic curves
For better understanding of this presentation, we provide Table 8, which
summarizes various speed ratios between field operations and of elliptic doubling
to addition.® From Table 8, we can see that our assumptions given in Sect. 2.3 are
quite reasonable at least on P6 and Alpha family microprocessors, i.e., A/M ≈ 0.15,
S/M ≈ 0.8 in GF(p^n) and S/M ≈ 0.15 in GF(2^n). Also note that I/M ranges from
5 to 7 for most fields (except for GF(p), GF(p^2) and GF(2^n)).



μP           Pentium II/266MHz                  Alpha 21164/533MHz
Field        A/M    S/M    I/M    Dbl./Add.     A/M    S/M    I/M    Dbl./Add.
GF(2^162)    0.03   0.13   14.0   0.47          0.05   0.16   10.5   0.48
GF(p^13)     0.17   0.73   5.54   0.72          0.15   0.59   4.99   0.66
GF(p^12)     0.18   0.74   6.63   0.73          0.15   0.61   6.11   0.64
GF(p^11)     0.12   0.77   6.46   0.71          0.14   0.62   6.39   0.69
GF(p^10)     0.14   0.78   6.04   0.77          0.18   0.66   5.98   0.70
GF(p^7)      0.11   0.79   6.05   0.73          0.14   0.87   6.17   0.72
GF(p^6)      0.13   0.82   6.41   0.73          0.18   0.87   5.79   0.70
GF(p^5)      0.18   0.82   5.89   0.74          0.12   0.81   4.60   0.76
GF(p^3)      0.18   0.80   7.59   0.75          0.15   0.89   6.88   0.74
GF(p^2)      0.16   0.86   19.9   0.73          0.13   0.90   10.4   0.75
GF(p)        0.15   0.88   42.7   0.74          0.12   0.85   31.7   0.75

Table 8. Speed ratios of field and elliptic curve operations



Timings for Algorithms LL/LL-SA/LL-SM on Pentium II/266MHz are given 
in Table 10, and timings on Alpha 21164/533MHz are given in Table 11. Here 
are some observations on the implementation results: 

— As expected from the analysis in Sect. 3 (compare the I/M ratios in Tables
3 and 4 with those in Table 8), Algorithms LL and LL-SA yield better
performances in projective coordinates than in affine coordinates for GF(2^n)
and GF(p^n) with n < 3.

— Algorithm LL-SA improves over Algorithm LL by about 10 to 25% in either
coordinates, with some exceptions in GF(p), GF(p^2) and GF(2^n). The
exceptions in these fields are actually expected from the speed ratio of I/M in
Table 8 (i.e., much higher values of I/M).

— Compared to individual scalar multiplication using Algorithm LL-SA in
preferred coordinates, simultaneous scalar multiplication using Algorithm
LL-SM can significantly reduce the time per scalar multiplication (up to 40%
for GF(p)).

— Algorithms LL/LL-SA/LL-SM can achieve 2 to 10 times speedup over the
ordinary signed window algorithm for scalar multiplication.

® The figures in Table 8 are different from the figures in Tables 9-11 in [13]. At the
time of writing [13], we had not implemented the field inversion method using
exponentiation from [2] (Algorithm BP, for short). Though the multiplicative
complexity of Algorithm BP seems higher than that of Algorithm IM in [13], our
actual implementations show that Algorithm BP runs about 20 to 30% faster than
Algorithm IM due to smaller overheads in other simple operations and loop controls.

Timings for elliptic scalar multiplication using Frobenius expansion are given 
in Table 9. We can see that Algorithm LL achieves about 15 to 20% improvement 
over the signed binary algorithm and that Algorithm LL-SA again improves over 
Algorithm LL by 5 to 15%. 





                                  w/o Frob.       binary          Alg. LL         Alg. LL-SA
μP                field     |k|   A      P        A      P        A      P        A      P
Pentium II        GF(p^13)  178   4.19   3.48     1.66   1.86     1.43   1.58     1.34   1.36
266MHz            GF(p^11)  160   5.17   4.07     2.03   2.11     1.78   1.81     1.62   1.58
                  GF(p^7)   168   3.03   2.42     1.40   1.44     1.24   1.24     1.19   1.16
Alpha 21164       GF(p^13)  178   3.19   2.51     1.27   1.41     1.08   1.17     1.03   1.04
533MHz            GF(p^11)  160   3.15   2.23     1.27   1.24     1.09   1.05     1.01   0.95
                  GF(p^7)   168   1.75   1.32     0.79   0.77     0.69   0.67     0.66   0.63

Table 9. Timings for scalar multiplication using Frobenius expansion (in msec, A:
affine, P: projective)



7 Conclusion 

Simultaneous inversion is a simple but powerful technique to speed up elliptic
curve arithmetic with a high degree of parallelism. Lim-Lee's algorithm for elliptic
scalar multiplication for a fixed point (Algorithm LL) allows a very high degree
of parallelism and thus can be substantially sped up using the simultaneous
inversion technique. This paper investigated such an improvement on Lim-Lee's
algorithm. More specifically, we presented and analyzed improved Lim-Lee
algorithms using simultaneous inversion: Algorithm LL-SA for computing a single
scalar multiple and Algorithm LL-SM for computing many scalar multiples at
a time. Implementation results of these algorithms on Pentium II and Alpha
21164 microprocessors were also provided to demonstrate the practical performance
improvement. We also showed that the presented algorithms can be used to
speed up general elliptic scalar multiplication using Frobenius expansion.
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2.06 
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0.74 
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R 
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0.48 
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0.47 


0.58 


0.85 
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0.75 


0.84 
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0.36 


0.41 
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0.73 
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0.65 


0.72 
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0.31 


0.35 


0.43 
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0.56 


0.62 
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0.30 


0.37 
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0.65 


0.48 


0.53 


0.80 
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0.84 
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0.69 


0.77 


0.95 


1.38 


1.63 


1.23 


1.36 


2.01 


T 


L 


3x4 


0.66 
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0.68 


0.52 


0.58 


0.71 


1.03 


1.23 


0.92 


1.02 


1.54 


1 


L 


4x4 


0.53 


0.47 


0.52 


0.40 


0.44 


0.54 


0.79 


0.94 


0.71 


0.78 


1.22 


V 


1 


5x4 


0.45 
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0.43 


0.32 


0.36 


0.45 


0.65 


0.78 


0.58 


0.64 


1.01 


E 


S 


6x4 


0.41 


0.34 


0.37 


0.28 


0.32 


0.39 


0.56 


0.67 


0.51 


0.56 


0.89 




A 


7x4 


0.37 


0.30 


0.32 


0.24 


0.28 


0.33 


0.49 


0.58 


0.44 


0.48 


0.78 






8x4 


0.33 


0.27 


0.28 


0.21 


0.24 


0.29 


0.42 


0.50 


0.38 


0.42 


0.69 



Table 10. Timings (in msec) for Algorithms LL, LL-SA and LL-SM for computing kG
with |k| = 160 on Pentium II/266MHz (timings for Algorithm LL-SM denote timings
per scalar multiplication for t = 100)


Speeding Up Elliptic Scalar Multiplication with Precomputation 119 





field                      GF(p^n)                                                       GF(2^n)
n                          1     2     3     5     6     7     10    11    12    13     162

AFFINE
  Win. Alg.                4.22  1.80  1.06  1.41  1.14  1.63  2.61  3.14  3.08  3.05   2.82
  LL      2x4              1.75  0.72  0.42  0.53  0.45  0.62  1.00  1.24  1.12  1.16   1.18
          3x4              1.32  0.54  0.32  0.40  0.34  0.47  0.76  0.94  0.85  0.87   0.90
          4x4              1.02  0.42  0.24  0.31  0.26  0.36  0.58  0.72  0.65  0.67   0.69
          5x4              0.83  0.34  0.20  0.25  0.21  0.30  0.48  0.59  0.54  0.55   0.56
          6x4              0.71  0.29  0.17  0.22  0.18  0.25  0.41  0.51  0.46  0.47   0.48
          7x4              0.60  0.25  0.15  0.18  0.16  0.22  0.35  0.43  0.39  0.40   0.41
          8x4              0.52  0.21  0.13  0.16  0.13  0.19  0.30  0.37  0.34  0.34   0.35
  LL-SA   2x4              1.09  0.56  0.35  0.49  0.40  0.55  0.88  1.05  0.97  1.04   0.88
          3x4              0.79  0.41  0.26  0.37  0.30  0.41  0.66  0.78  0.72  0.78   0.65
          4x4              0.58  0.31  0.20  0.28  0.23  0.31  0.50  0.59  0.55  0.59   0.49
          5x4              0.47  0.25  0.16  0.23  0.19  0.25  0.40  0.49  0.44  0.49   0.40
          6x4              0.41  0.22  0.14  0.20  0.16  0.22  0.35  0.42  0.38  0.42   0.34
          7x4              0.35  0.19  0.12  0.17  0.14  0.19  0.30  0.36  0.33  0.36   0.29
          8x4              0.30  0.16  0.11  0.15  0.12  0.16  0.26  0.31  0.28  0.31   0.25
  LL-SM   2x4              0.42  0.41  0.31  0.49  0.39  0.50  0.80  0.92  0.87  1.00   0.57
          3x4              0.31  0.31  0.23  0.36  0.29  0.37  0.58  0.68  0.65  0.74   0.43
          4x4              0.24  0.23  0.18  0.28  0.22  0.28  0.44  0.52  0.49  0.56   0.33
          5x4              0.19  0.19  0.14  0.22  0.18  0.23  0.36  0.42  0.40  0.45   0.26
          6x4              0.17  0.16  0.12  0.19  0.15  0.20  0.31  0.36  0.34  0.39   0.23
          7x4              0.14  0.14  0.11  0.17  0.13  0.17  0.27  0.31  0.30  0.33   0.19
          8x4              0.12  0.12  0.09  0.14  0.11  0.14  0.23  0.26  0.25  0.29   0.17

PROJECTIVE
  Win. Alg.                1.13  1.06  0.79  1.24  0.90  1.23  2.00  2.24  2.12  2.39   1.22
  LL      2x4              0.56  0.57  0.41  0.63  0.47  0.63  1.02  1.18  1.10  1.28   0.77
          3x4              0.43  0.44  0.31  0.48  0.36  0.48  0.78  0.91  0.84  0.98   0.59
          4x4              0.34  0.34  0.24  0.38  0.28  0.38  0.61  0.71  0.66  0.77   0.47
          5x4              0.28  0.28  0.20  0.31  0.23  0.31  0.50  0.59  0.55  0.63   0.38
          6x4              0.24  0.24  0.17  0.27  0.20  0.27  0.43  0.50  0.47  0.54   0.33
          7x4              0.21  0.21  0.15  0.23  0.17  0.23  0.37  0.43  0.40  0.47   0.29
          8x4              0.19  0.18  0.13  0.20  0.15  0.20  0.32  0.37  0.35  0.40   0.25
  LL-SA   2x4              0.50  0.45  0.33  0.52  0.39  0.52  0.85  0.96  0.92  1.04   0.63
          3x4              0.39  0.35  0.25  0.39  0.30  0.39  0.64  0.73  0.68  0.79   0.49
          4x4              0.31  0.27  0.19  0.30  0.23  0.30  0.49  0.56  0.53  0.60   0.38
          5x4              0.26  0.22  0.16  0.25  0.19  0.25  0.40  0.47  0.43  0.50   0.32
          6x4              0.24  0.19  0.14  0.22  0.17  0.22  0.35  0.40  0.38  0.43   0.28
          7x4              0.21  0.17  0.12  0.19  0.14  0.19  0.30  0.35  0.33  0.37   0.24
          8x4              0.19  0.15  0.11  0.16  0.13  0.16  0.26  0.30  0.28  0.32   0.21

Table 11. Timings (in msec) for Algorithms LL, LL-SA and LL-SM for computing kG
with |k| = 160 on Alpha 21164/533MHz (timings for Algorithm LL-SM denote timings
per scalar multiplication for t = 100)
Abstract. The design rationale for many key distribution schemes for
multicast networks is based on heuristic arguments about efficiency,
flexibility and scalability. In most instances the choice of key server placement
in a multicast network architecture is based on intuitive cryptographic
considerations. We use an analytical model of multicast group formation
and network growth to look at the selection of a key distribution scheme
from a network operation perspective. Thereafter, this model is used to
validate the choice of the hierarchical (hybrid) key distribution model as the
most appropriate.

Keywords: Network security, multicast networks, key distribution
architectures



1 Introduction 

The phenomenal growth of wide area networks, in the form of the ubiquitous
Internet, has given rise to many new applications that are different from the typical
one-to-one (unicast) communication model of standard network applications.
Many of the new applications in information distribution and collaborative
activities such as web-casting, shared white-boards, on-line auctions, etc., have a
one-to-many (multicast [6, 7]) model of communications. There are two main
reasons that motivate the use of multicast for highly distributed network
applications:

1. The number of messages a sender needs to transmit is reduced. This is
because a single multicast address represents a large number of
individual receivers. This results in a lower processing load for the sender
and also simplifies the application design.

2. The number of messages in transit over the network is reduced. As the
correct message delivery is handled by multicast-capable routers, which
normally make redundant copies of a message only when transmitting on
divergent network links, data meant for a group of receivers is transmitted as
a single message for most of the network. This in turn improves the
overall network bandwidth utilization.

* Since Sept. 1999, author has been with Vrije Universiteit, Department of mathemat- 
ics and computer science, De Boelelaan 1081a, 1081 HV Amsterdam, The Nether- 
lands, leiwo@cs.vu.nl 

JooSeok Song (Ed.): ICISC’99, LNCS 1787, pp. 120-131, 2000. 

© Springer- Verlag Berlin Heidelberg 2000 




Why Hierarchical Key Distribution Is Appropriate for Multicast Networks 



121 



Therefore, multicast data transmission provides significant benefits to both 
the applications and the network infrastructure and consequently is an impor- 
tant network technology for emerging applications. The basic difference between 
broadcast networks and multicast networks is that in multicast, delivery is to a 
specifically targeted group. This group may be created based on many metrics 
such as affiliation to a certain institution, long-duration membership subscrip- 
tions, short-duration tickets, etc. Many of the group management functions such 
as join, leave or re-join that control membership of a multicast group require 
cryptographic techniques to ensure that integrity of the control process is not 
compromised by malicious users or intruders. Furthermore, the multicast appli- 
cation itself may require secure data transmission to and from members. As the 
communication model of multicasting is different from unicast communication, 
the attacks and threat models are also different for multicast networks and in 
fact more severe [2]. 

To provide secure group management services, standard security functions 
such as identification, authentication and message transmission with confiden- 
tiality and integrity are required. The basic support service for secure group 
management in multicast networks is session key distribution which incorpo- 
rates the primary functions of member identification, authentication and session 
key transport. The key distribution schemes described in literature can be classi- 
fied under three basic models of centralized, distributed or hierarchical as shown 
in figure 1. In the fully distributed scheme, although shown as a tree, a fixed 
root may not be physically present. 





Fig. 1. Standard multicast group control methods. The fully distributed method 
shown in (b) requires horizontally structured coordination among participating 
controller nodes 



Motivation. The cryptographic research literature is replete with sophisticated
key distribution architectures for multicasting based on wide-ranging
assumptions, while the networking community has adopted only a handful of techniques
in proposed or experimental secure multicasting schemes. The work presented
in this paper was motivated by the inadequate consideration given to
network-centric issues when developing solutions that are grounded in cryptography.

Organization. In section 2, we overview several secure multicast schemes to see 
if their key distribution scheme selection is based on network considerations or 
cryptographic issues (or a combination of both). In section 3, we develop analyt- 
ical arguments from a network perspective to validate the choice of hierarchical 
key distribution as the preferred framework. We make concluding remarks in 
section 4. 

2 Related Work 

In general, control and routing tree structure selection (shared trees, shortest- 
path trees, etc.) and protocol algorithm design for multicasting is based on ex- 
pected sparseness/denseness of multicast group, efficiency in terms of number 
of messages, low message propagation delay, ease of recovery from message loss 
and low overhead in group management. For secure multicasting, in which the 
main design aspect is the key distribution scheme, designers may opt to con- 
sider underlying multicast network characteristics or mainly use cryptographic 
metrics such as number of rounds required for key distribution, size of security 
control messages and key update/change techniques. Next we briefly review
previous work from the literature that has taken different approaches to implementing
secure multicasting.

A design for a secure key distribution architecture is presented in [14] that is 
overlaid on the core-based tree (CBT) multicast routing protocol [3]. The justi- 
fication for the hybrid control structure of [14] in which key distribution centers 
(KDC) are co-located with routers is based on the favorable characteristics of 
the multicast protocol rather than on multicast network structure itself. Among 
the main reasons given for the use of CBT framework for key distribution are 
the pre-existing scalability properties of the routing protocol, close relationship 
between grouping structure and router placement and the ability to combine 
processing workload for router setup and key distribution. Early work on key 
distribution schemes based closely on underlying multicast protocol structures 
appeared in [1, 2, 11]. 

Similarly, the Iolus secure multicasting framework [15] is based on a
distributed tree of group security intermediaries (GSI) for subtrees and an
overall group security controller (GSC) for coordination of GSIs. The collection of
these group security agents constitutes a hybrid key distribution architecture.
However, the framework is designed to operate over many different multicast
protocols including CBT and protocol independent multicasting (PIM) [7]. The
distributed registration and key distribution (DiRK) technique presented in [16]
is another decentralized and distributed model, independent of the multicast
protocol, that simply assumes a hybrid model is better suited for large scale
multicast groupings. Similar proposals appear in [9].

In contrast, the SecureRing suite of group communication protocols [12] uses
multicasting and a fully distributed control structure to provide membership
management and message distribution under Byzantine errors but does not
depend on any particular characteristics of the underlying multicast routing
protocol for efficient or reliable operation. The scheme uses cryptographic message
digests and Byzantine fault detectors among other techniques to achieve
efficiency and reliability. Similar cryptographic protocol based work also appears in
[4, 8, 10, 13].

In summary, most key distribution schemes for secure multicasting use the
hybrid model of key server placement. While this approach is intuitively
reasonable, there is no analytical basis to support the model selection.
In the next section, we analyze the growth and formation of multicast
groups in wide area networks to provide evidence for the correctness of choosing
a hybrid model.



3 Analysis of Key Distribution Agent Placement Models 

We start our analysis using a regular tree structure, which is more tractable than
a general network topology. Consider a multicast distribution tree as shown in
figure 2 with arity k and depth D, where all the leaf nodes represent hosts that
could be potential members of a multicast group. The inner nodes represent
routers and the nodes at depth $D-1$ denote sites (or local clusters). Therefore
we have a regular network structure with total number of hosts $M = k^D$ and
total number of sites $m = k^{D-1}$.




[Figure 2: a 4-ary tree of depth D with the hosts at the leaves]

Fig. 2. The basic k-ary tree used to model the multicast distribution tree



3.1 Clustering of Hosts in the Multicast Distribution Tree 

First we look at the effect on key distribution schemes due to the clustering of 
hosts. When we select a number of hosts to create a multicast group (say, of total 
size n), they could be arbitrarily distributed among several sites. While a single 
member multicast group will have a node from only a single site, a two member 
multicast group can select nodes from one or two distinct clusters. Following 
this argument, we can determine the best possible and worst possible clustering
of hosts in sites when creating a multicast group. The plots of the two curves
(equations 1 and 2) are shown in the graph of figure 3.



Best case curve: $m = \lceil n/k \rceil$ (1)

Worst case curve: $m = \min\{n, k^{D-1}\}$ (2)




Fig. 3. The graph of number of distinct sites vs. number of hosts in the multicast 
group shows the allowable variability in denseness/sparseness for a multicast 
group of given size 
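
The two bounds on the number of occupied sites can be evaluated directly. The following is a minimal sketch, assuming that the best case packs k hosts into each site (a ceiling of n/k) and that the worst case places one host per site until all $k^{D-1}$ sites are in use. The function names are illustrative and do not come from the paper.

```python
def best_case_sites(n, k):
    """Fewest distinct sites that can hold n hosts when each site has k hosts."""
    # Integer ceiling of n / k, avoiding floating point.
    return (n + k - 1) // k


def worst_case_sites(n, k, depth):
    """Most distinct sites that n hosts can occupy in a k-ary tree of the given depth."""
    # One host per site until every one of the k**(depth - 1) sites is used.
    return min(n, k ** (depth - 1))


if __name__ == "__main__":
    k, depth = 4, 3
    for n in (1, 10, 40, 64):
        print(n, best_case_sites(n, k), worst_case_sites(n, k, depth))
```

The gap between the two values for a given n is the variability range discussed in the observations that follow.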



We make the following observations on the distribution of hosts in sites when
setting up a multicast group, as derived from the uniform tree structure:

1. The conventional sparse region was defined based on the observation that it
has a relatively small number of hosts in the group and therefore, even in
the worst case, the hosts can only be distributed into a few sites. The recommended
key distribution architecture for this scenario is the centralized model. With any
other model, the multicast network would needlessly place key distribution
(sub)agents in inner nodes, most of which would be unused. In this region, the
main issue is efficient use of security agents and not scalability.

2. The conventional dense region was defined based on the observation that it
has a relatively large number of hosts in the group and therefore, even
in the best case, the hosts are distributed over nearly all the sites. In this
instance, the recommended architecture is the distributed model. Any other
model will create a bottleneck at the root, affecting performance
and making it difficult for the key distribution architecture to scale with
the growth of the multicast network. In this region, the main issues concern
both efficiency and scalability.

3. From a practical standpoint, the most interesting region is the middle area where
the variability range is significant. Essentially, this means we might have
either a densely populated or sparsely populated multicast network depending
on the host distribution among sites. Given the large range of sites (m) over
which a multicast group of given size (n) can be spread, it is quite
impractical to discuss an average case scenario. The standard approach would be to
use the hierarchical model as the key distribution agent architecture.



3.2 Total Size of the Multicast Distribution Tree 



Next we look at the effect of clustering of hosts on the total size of the multicast 
distribution tree. For the purpose of analyzing the cost of message distribution, 
we assume a fixed transmission cost for any link in the multicast tree. For a 
multicast distribution tree represented as a uniform tree structure, the lowest 
total cost is obtained when hosts are densely located in the smallest possible 
number of sites as shown in figure 4 (a). The total size of the distribution tree 
L for a multicast group with n members is obtained by progressively counting 
the total number of links in all the full sub trees below a given level from top 
to bottom as shown in equation 3. The quantity $\phi_l$ denotes the total number of
nodes counted prior to level l and the value $p_l$ accounts for the link traversed
when moving to the next level below to process a partially filled sub tree.




(a) Shortest path length grouping 
for multicast hosts 




(b) Longest path length grouping 
for multicast hosts 



Fig. 4. The best and worst case grouping with respect to number of network
links over which messages should pass are given by (a) depth-first search tree
and (b) breadth-first search tree

Best case curve:

$L(n) = \ldots$ (3)

where $\phi_l$ is defined piecewise for $l = 0$ and $l > 0$ (the $l > 0$ case
involves the subtree size $k^{D-(l-1)}$), and $p_l = 0$ when $n = \phi_{l-1}$.
[The remaining terms of equation 3 are not recoverable from the source.]


The highest total cost for a multicast distribution tree occurs when the hosts
are sparsely distributed among as many sites as possible as shown in figure 4
(b). The distribution is limited by the saturation value $\psi$ shown in equation 4,
which is the maximum number of clusters possible. Several sample plots of the two
curves are shown in figure 5.



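Worst case curve:

$L(n) = \ldots + (n - k^{p})(D - p)$ (4)

where the value of $p$ depends on the range in which n falls ($n < k$, or
$k^{p} < n < \ldots$). [The remaining terms of equation 4 are not recoverable
from the source.]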

Previously we have discussed non-random clustering of hosts to form a
multicast distribution tree in order to study the worst case and best case costs of
the delivery tree. Next we look at the random formation of a multicast tree to
analyze the total delivery cost for the average case. When a host is selected at the
leaf level of the tree to form a multicast group, at level l, a route through one
of $k^l$ links needs to be selected. Therefore, the probability that a given link at
level l is in the multicast delivery tree is $1/k^l$. Furthermore, the probability of a
link being used in the delivery tree after n hosts have been selected at leaf level
is $1 - (1 - 1/k^l)^n$. If hosts are being selected at random at leaf level to form the
multicast group, the average number of links at level l that will be included in
the delivery tree is $k^l (1 - (1 - 1/k^l)^n)$. Finally, assuming the link selection
process to be a set of independent events, the total size of the multicast tree for a
group with n members can be expressed as equation 5 (this result appears in
[17] also).

$L(n) = \sum_{l=1}^{D} k^l \left(1 - \left(1 - \frac{1}{k^l}\right)^n\right)$ (5)
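
To make the three cases concrete, the sketch below counts links level by level in a k-ary tree of depth D. The average case follows the per-level expectation derived in the text. The best and worst cases are our own level-wise restatements (hosts packed into adjacent leaves, and hosts spread over as many subtrees as possible); they are assumptions for illustration and are not a transcription of equations 3 and 4. All function names are hypothetical.

```python
def best_case_links(n, k, depth):
    """Links used when the n hosts occupy adjacent leaves (densest packing)."""
    # At level l, each used link covers a subtree of k**(depth - l) leaves,
    # so ceil(n / k**(depth - l)) links are needed.
    return sum(-(-n // k ** (depth - l)) for l in range(1, depth + 1))


def worst_case_links(n, k, depth):
    """Links used when the n hosts are spread over as many subtrees as possible."""
    # At level l there are only k**l links, and n hosts can use at most n of them.
    return sum(min(n, k ** l) for l in range(1, depth + 1))


def average_links(n, k, depth):
    """Expected links used when the n hosts are chosen at random (with replacement)."""
    # Each of the k**l links at level l is used with probability 1 - (1 - k**-l)**n.
    return sum(k ** l * (1 - (1 - k ** -l) ** n) for l in range(1, depth + 1))


if __name__ == "__main__":
    k, depth = 4, 8
    for n in (1, 10, 100, 1000, 10000):
        print(n, best_case_links(n, k, depth),
              round(average_links(n, k, depth), 1),
              worst_case_links(n, k, depth))
```

Because the average-case expression samples hosts with replacement, it falls below the full tree size when n approaches the total number of hosts.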



Fig. 5. The graph plots the total size of the multicast distribution tree (in terms
of inter-node links) vs. number of hosts assembled into both worst case and
best case multicast groupings. The regular trees have k values 2 (D = 15), 3
(D = 10) and 4 (D = 8)

The set of graphs in figure 6 plots the curves for best, average and worst case
scenarios for the same k and D. As can be seen from the graphs, the average
cost of the multicast delivery tree is closer to the worst case cost for small (and
therefore sparse) groups and tends toward best case cost for large (and therefore
dense) groups. This result is intuitively correct and validates the expressions
developed previously to analyze the structure of the multicast tree with respect
to clustering of hosts. However, as can be seen from the graphs, the accuracy of
the average case curve is lost as the number of hosts increases, where the curve
dips below the best case result.



[Figure 6: three panels (k=2, D=15; k=3, D=10; k=4, D=8), each plotting L against n for the worst, average and best cases]

Fig. 6. Total size of the multicast distribution tree vs. number of hosts



The outcome of the foregoing analysis is that for most values of the multicast 
group size (n), the total size of the multicast distribution tree (L) can vary widely. 
This behavior again leaves the hierarchical key distribution architecture as the 
preferred option.

3.3 Applicability of Results to General Multicast Trees

Our analysis so far was based on uniform multicast distribution trees. However,
practical multicast distribution structures normally take the shape of irregular
trees. An important question at this point is how relevant the results of an
analysis based on uniform trees are to real multicast networks. To answer this question,
we look at the results obtained by Chuang and Sirbu [5] on the relationship
between multicast distribution tree size and size of the membership for general
multicast networks. According to the Chuang-Sirbu scaling law, the normalized
multicast tree cost is directly proportional to the 0.8 power of the group size
(shown in equation 6) for randomly selected group members. The normalized
tree cost is obtained as the ratio between total multicast distribution tree length
($L_m$) and average unicast delivery path length ($L_u$).



$\left[\frac{L_m}{L_u}\right]_{general} \propto n^{0.8}$ (6)



We can compute the normalized tree cost for the uniform distribution tree 
with random member selection using equation 5. The average unicast tree length 
in this case is the tree depth D. Therefore, for the uniform multicast tree, the 
normalized tree cost can be given as equation 7. 




$\left[\frac{L_m}{L_u}\right]_{uniform} = \frac{L(n)}{D}$ (7)




Fig. 7. The graph of normalized distribution tree cost vs. multicast group size 
with constant of proportionality for Chuang-Sirbu curve set at 1.5 



The graph in figure 7 shows that the shape of the normalized distribution tree
curves for different k values of uniform trees follows that of the general curve
due to the Chuang-Sirbu scaling law for the range of n in which the average curve
lies between the best case and worst case curves of figure 5. The selection of the
proportionality constant is admittedly arbitrary, but its function is simply to
scale the curves with no distortion of the shape. As shown in the log-scale graph
of figure 8, the value was selected for a close fit with plots for uniform trees.
The implication of this matching of curves representing theoretical multicast
networks to a curve of general multicast networks is that, for most
group membership sizes (n), we can expect the average distribution cost (L) of real
multicast networks also to lie approximately midway between the best case (dense)
and worst case (sparse) values.
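
The comparison can be reproduced numerically. The sketch below is illustrative only: it assumes the average-case tree size is the sum over levels of $k^l (1 - (1 - 1/k^l)^n)$, normalizes it by the depth D, and sets it against $c \cdot n^{0.8}$ with the constant c taken from the figure captions. The function names are hypothetical.

```python
def normalized_uniform_cost(n, k, depth):
    """Average multicast tree size for a uniform k-ary tree, divided by the depth."""
    avg_links = sum(k ** l * (1 - (1 - k ** -l) ** n) for l in range(1, depth + 1))
    return avg_links / depth


def chuang_sirbu_cost(n, c=1.5):
    """Normalized tree cost under the 0.8 power law, with proportionality constant c."""
    return c * n ** 0.8


if __name__ == "__main__":
    for k, depth in ((2, 15), (3, 10), (4, 8)):
        for n in (1, 10, 100, 1000):
            print(k, depth, n,
                  round(normalized_uniform_cost(n, k, depth), 2),
                  round(chuang_sirbu_cost(n), 2))
```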




Fig. 8. The graph of normalized distribution tree cost vs. multicast group size
with constant of proportionality for Chuang-Sirbu curve set at (a) 1.5 and (b) 1.0



In summary, the significance of this average total distribution cost curve of 
real multicast networks not being closer to sparse or dense formation of groups is 
that it is not meaningful to use a centralized or fully distributed control structure 
for key distribution. This in turn provides an analytical basis for using the hybrid 
control structure for key distribution. 

4 Conclusion 

A key distribution framework provides the backbone for any secure multicast 
architecture. Although the most widely used model for key distribution is the 
hybrid scheme, the reasons for its selection are usually heuristic arguments of 
flexibility and scalability. In this work we have used a different approach to
validate the use of the hybrid model by providing analytical arguments to exclude
the use of both centralized and fully distributed control models. Although this
work is based on key distribution in multicast networks, the results are applicable 
in other contexts such as loss recovery where a hierarchical control structure may 
be used.

Abstract. The process of selection is omnipresent in the real world and
modeling this process as a cryptologic protocol will enable cross use of 
techniques among similar protocol applications, which will eventually 
lead to better understanding and refinement of these applications. We 
present a proposal for a specialised selection protocol with anonymity 
as the security service. An area for its application is anonymous peer 
review, where no peer should know the identity of the reviewer. 



1 Introduction 

Many seemingly different protocol problems share a variety of common proper- 
ties. The reason for this may be because confidentiality, integrity and identifica- 
tion are the basic services that cryptologic protocols employ. Complex services 
can be built using different building blocks. For example: 

1. Signature systems employ integrity and identification services. 

2. Anonymity systems employ confidentiality and identification services. 

3. Blind signature systems employ services of anonymity and signature systems. 

The first step towards developing secure and efficient solutions for such systems is 
to precisely understand their goals. A faster and better understanding of
such protocols can be achieved by analysing a collection of similar protocol
problems that possess similar goals. In this paper we shall concentrate on a
protocol problem that requires setting up of a peer review system. A feature of 
this system is its similarity to many other problem instances currently under 
investigation. We shall call this collection the set of secure selection protocols. 

A second phenomenon that is also common is compliance. Research into a 
special class of cryptosystems called compliant cryptosystems offering services 
to different sets of users with logically contradicting requirements has become 
prominent. The existence of safety valve mechanisms is a fundamental property 
in these systems. Popular examples of such systems are escrowed encryption, fair 
electronic cash, electronic voting and group signature. The term (or concept) 
compliance has been used in the literature either implicitly or explicitly. For 
instance Desmedt [5] introduced society and group oriented cryptography, which 
implicitly identified the issue of compliance when certain functionalities were
shared by sets of users. There are informal and explicit discussions of this term
in the literature [7].

This paper will present our proposal for a cryptologic protocol for the design 
of a peer review system, which is a selection process with one-way anonymity 
as the security service. A proposal for transforming the system into a compliant 
(or fair) selection system will also be discussed. 

2 Cryptographic Tools and Primitives 

The important tools and primitives used are: 

1. Proof of knowledge of discrete logarithm. 

2. Proof of partial knowledge of discrete logarithm. 

3. Electronic cash technology. 

2.1 Proof of Knowledge of Discrete Logarithm 

We will use the proof of knowledge introduced by Schnorr [14] in the
non-interactive mode. Here the prover P has to prove that he knows the discrete
logarithm v of a public value u, where $u = g^v \bmod p$ and g is a publicly known
generator of the group $Z_p^*$. The prover performs the following function:

Begin Function PKGen

Choose at random $k \in_R Z_p$
Compute $r = g^k \bmod p$ and $c = H(u, r)$
        $d = cv + k \pmod{p - 1}$
Send to verifier $(c, d, r)$

End Function PKGen



The verifier performs the following function: 

Begin Function PKVer

Check $g^d = u^c r \pmod p$
      $c = H(u, r)$
If SUCCESS output 1
Else output 0

End Function PKVer

If H is a cryptographically secure hash function, the verifier can be convinced
that the prover knows $\log_g u \bmod p$ when the function PKVer outputs 1.
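
The two functions can be exercised with a small script. This is a minimal sketch under stated assumptions: the modulus 467 and base 2 are toy values chosen for readability and offer no security, SHA-256 reduced modulo p − 1 stands in for the hash function H, and the verifier checks $g^d = u^c r$ as implied by $d = cv + k$. The names pk_gen, pk_ver and H are illustrative.

```python
import hashlib
import secrets

P = 467  # toy prime modulus (assumption, far too small for real use)
G = 2    # base used as the public generator g (assumption)


def H(*values):
    """Hash integers to a challenge in [0, P - 2] (SHA-256 stands in for H)."""
    data = ",".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 1)


def pk_gen(v):
    """Prover: non-interactive proof of knowledge of v, where u = g^v mod p."""
    u = pow(G, v, P)
    k = secrets.randbelow(P - 1)   # fresh random exponent
    r = pow(G, k, P)               # commitment
    c = H(u, r)                    # challenge derived by hashing
    d = (c * v + k) % (P - 1)      # response
    return u, (c, d, r)


def pk_ver(u, proof):
    """Verifier: output 1 if g^d = u^c * r (mod p) and c = H(u, r), else 0."""
    c, d, r = proof
    ok = pow(G, d, P) == (pow(u, c, P) * r) % P and c == H(u, r)
    return 1 if ok else 0


if __name__ == "__main__":
    u, proof = pk_gen(123)
    print(pk_ver(u, proof))
```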

2.2 Proof of Partial Knowledge of Discrete Logarithm 

Cramer et al. [3,4] proposed a scheme to transform an interactive proof system 
into a proof system that will convince a verifier that the prover knows some
secret, using a suitable secret sharing scheme with an appropriate access structure.
In this section we propose a modification to the witness indistinguishable
variant of the Schnorr identification protocol [14] proposed in [4] to obtain a more
computationally efficient protocol construct that can be used for the proof of 
knowledge of discrete logarithm. Our proposal transforms their interactive proof 
system into a non-interactive proof system and applies the screening technique 
used in batch verification methods [15,1] to the protocol proposed in [4]. The 
soundness and completeness properties of the protocol in [4] are not affected 
by the changes when a cryptographically secure hash function is used. This is 
due to the use of standard hashing technique [6] for the transformation. We also 
integrate the Schnorr signature scheme [14], so that the prover will provide the 
verifier with transcripts for the proof that also contains his/her signature. 

Suppose that a set of values $U = \{u_i = g^{v_i} \mid i = 1, \ldots, n\}$ are publicly known
and a prover, possessing the public key $y_j$ ($y_j = g^{x_j}$), wishes to prove to a verifier
that he/she knows the discrete logarithm of at least one of the public values.
For this to happen the verifier must allow the prover to simulate (or cheat) at
most $n - 1$ proofs. Assume that the prover knows $v_j$, which is the secret value
corresponding to $u_j$ for some $j \in \{1, \ldots, n\}$. The prover performs the following
function:

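Begin Function PPKGen

Choose at random $k_j \in_R Z_p$, $\{c_l, d_l \in_R Z_p \mid l \neq j\}$

Compute
    $r_j = g^{k_j}$ (1)
    $\{r_l = g^{d_l} u_l^{c_l} \mid l \neq j\}$ (2)
    $r = \prod_{i=1}^{n} r_i \bmod p$ (3)
    $c = H(u_1, \ldots, u_n, r)$ (4)
    $c_j = c - \sum_{l \neq j} c_l$ (5)
    $d = k_j - v_j c_j - x_j c + \sum_{l \neq j} d_l \pmod{p - 1}$ (6)

Send to verifier $(d, c, \{c_i \mid i = 1, \ldots, n\})$

End Function PPKGen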


The verifier performs the following function: 




Begin Function PPKVer

Check
    $c = H(u_1, \ldots, u_n, g^d y_j^c \prod_{i=1}^{n} u_i^{c_i})$ (7)
    $c = \sum_{i=1}^{n} c_i$ (8)
If SUCCESS output 1
Else output 0

End Function PPKVer



If the function PPKVer outputs 1 when the transcripts from the prover are
provided as inputs, then the verifier can, with a very high probability, decide
that the prover knows the discrete logarithm of at least one of the n public
values.
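
The following sketch runs one round of the combined proof. It rests on several assumptions of ours: the simulated commitments are taken as $r_l = g^{d_l} u_l^{c_l}$, the response adds the sum of the simulated $d_l$, the challenges are summed modulo p − 1, and the toy parameters (modulus 467, base 2, SHA-256 as the hash) are for illustration only and give no security. The function names are hypothetical.

```python
import hashlib
import secrets

P = 467    # toy prime modulus (assumption, illustration only)
G = 2      # base used as the public generator g (assumption)
Q = P - 1  # exponents are reduced modulo p - 1


def H(*values):
    """Hash integers to a challenge in [0, Q - 1] (SHA-256 stands in for H)."""
    data = ",".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def ppk_gen(us, j, v_j, x_j):
    """Prover knows v_j (log of us[j]) and x_j (log of the public key y_j)."""
    n = len(us)
    k_j = secrets.randbelow(Q)
    cs = [secrets.randbelow(Q) for _ in range(n)]  # simulated challenges (slot j is replaced)
    ds = [secrets.randbelow(Q) for _ in range(n)]  # simulated responses (slot j is unused)
    others = [l for l in range(n) if l != j]
    r = pow(G, k_j, P)                             # real commitment r_j
    for l in others:                               # multiply in the simulated r_l
        r = r * pow(G, ds[l], P) * pow(us[l], cs[l], P) % P
    c = H(*us, r)                                  # overall challenge
    cs[j] = (c - sum(cs[l] for l in others)) % Q   # the challenges must sum to c
    d = (k_j - v_j * cs[j] - x_j * c + sum(ds[l] for l in others)) % Q
    return d, c, cs


def ppk_ver(us, y_j, proof):
    """Verifier: recompute the commitment product and check both equations."""
    d, c, cs = proof
    r = pow(G, d, P) * pow(y_j, c, P) % P
    for u, c_i in zip(us, cs):
        r = r * pow(u, c_i, P) % P
    return 1 if (c == H(*us, r) and c == sum(cs) % Q) else 0


if __name__ == "__main__":
    vs = [11, 22, 33]                   # secrets behind the public values
    us = [pow(G, v, P) for v in vs]
    x = 77                              # prover's long term private key
    y = pow(G, x, P)
    print(ppk_ver(us, y, ppk_gen(us, 1, vs[1], x)))
```

The prover needs only one of the secrets in vs; the verifier cannot tell from the transcript which index j was used.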

Computational Requirements: The function PPKGen requires 2n — 1 modular 
exponentiations and the function PPKVer requires n + 2 modular exponentia- 
tions. 



Analysis. We shall assume that the hash function, H, used for the proof of
partial knowledge, is cryptographically secure.

When |U| = 1, i = j and the proof is the usual proof for knowledge of discrete
logarithm. So the verification equation will be of the form,

$c = H(u_j, g^d (y_j u_j)^c)$ (9)

which is a standard Schnorr signature with the signature inputs to the hash
function of the form $g^d (y_j u_j)^c$. It is evident that this transcript can be formed
without the knowledge of discrete logarithm of both $y_j$ and $u_j$, if and only if
the Schnorr signature can be forged. Thus, based on the assumption that the
Schnorr signature is unforgeable, the above equation is sound, in that the prover
cannot cheat the verifier.

When $y_j = 1$, equation 7 will be of the form

$c = H(u_1, \ldots, u_n, g^d \prod_{i=1}^{n} u_i^{c_i})$ (10)

which is the verification equation for the non-interactive version of the protocol
proposed in [4]. If the prover can form this equation without the knowledge of
discrete logarithms for any of the $u_i$'s then the protocol construct proposed by
Cramer et al. [4] is flawed or the standard hashing technique proposed by Fiat
and Shamir [6] is flawed. Thus on the assumption that both the techniques [6,4]
are not flawed, the above equation is sound and complete, in that the prover
cannot cheat the verifier and the honest prover will generate transcripts that
will be accepted by the verifier.

Observe that the Schnorr signature scheme [14] and the partial proof of 
knowledge protocol [4] are derivatives of the Schnorr identification protocol [14]. 
The Schnorr identification scheme is a three move protocol, the moves being com- 
mitment, challenge and response. Our protocol constrains the prover to use the 
same commitment and challenge to generate (two) different responses, namely 
Schnorr signature and partial proof of knowledge, that can be independently 
interpreted by the verifier. Thus, the proposed protocol constrains valid
transcripts to contain a message tuple $(u_1, \ldots, u_n)$, commitment (r), challenge (c),
and the response (d) that is interpreted as a Schnorr signature on the message
tuple and proof of knowledge of at least one discrete logarithm in the set of
public values $(u_1, \ldots, u_n)$.

2.3 Electronic Cash Technology

In our proposal, the electronic cash technology will be employed to generate 
anonymous tokens. This section will provide a brief summary of the fair off- 
line electronic cash scheme proposed by Frankel et al. [8,9], where a conditional 
anonymity service is offered when no coin is spent more than once. It employs 
the restrictive blind signature scheme presented by Brands [2]. The protocol 
specifications for the fair electronic cash system [8] are presented in Appendix A. 
In their proposal the anonymity of a coin can be revoked by a set of Trustees 
(or Ombudsman). 

An e-cash system consists of the withdrawal, payment and deposit protocols. 
Withdraw: In the withdrawal protocol, the client: 

1. Authenticates to the mint and conveys its intention to withdraw some cash. 

2. Generates a random message with a predefined structure and blinds this 
message. 

3. Obtains a signature on the blinded message from the bank using an ap- 
propriate public key corresponding to the denomination and unblinds the 
signature. 

This process is called restrictive blind signature because the client is not allowed
to obtain the bank's blind signature on arbitrary messages, but only on messages
with a pre-defined structure (which could contain information on the identity of the client).

Payment: In the payment protocol, the client performs the following step 
1 and if anonymity revocation is required it performs step 2. 

1. The client anonymously contacts the merchant and proves that he/she knows 
the representation of the coin (restrictive blind signature on a random mes- 
sage by the mint). In this step, the transcripts bind the identity of the 
merchant to the coin. 

2. The client proves to the merchant that the transcript contains the encryption
of his/her identity under the public key of a trustee (typically, a distributed
entity), without revealing the identity. Since the mint would expect the
merchant to provide the transcript of this proof along with the transcript from
the previous step before crediting the merchant's account, the merchant must
obtain a valid transcript from the client.

Deposit: In the deposit protocol, the merchant and the mint perform the 
following steps. 

1. The merchant sends the transcripts of the payment protocol along with its 
public key or credentials. 

2. The mint verifies the transcripts and, if successful, credits an appropriate 
amount to the merchant’s account. 

Trace: The mint and the trustees perform the following steps to trace the 
owner of a coin (received from the merchant). 

1. The mint sends the deposit transcripts to the trustee.

2. The trustee decrypts the ciphertext that is present in the transcript and
sends the information on the identity of the owner to the mint.

Actually, there are two forms of anonymity revocation, namely coin tracing and 
owner tracing [8,9]. We are only interested in the owner tracing facility, in this 
paper. This protocol module will be applicable only to the systems in which 
conditional anonymity service, as opposed to unconditional anonymity service, 
is offered. 

3 Protocol Phases 

We propose a three phase protocol schema to solve the peer review problem. The 
actual protocol will be presented in Section 4. The peer review problem consists 
of a set of participants called peers, having two roles in the system, namely 
reviewer and candidate to be reviewed. Since no participant should review itself, 
a solution to the peer review problem is a permutation of a set with no fixed 
points. The properties of the peer review protocol are: 

1. The solution must define a permutation without any fixed points. 

2. Every reviewer is also a candidate. 

3. The solution must provide one-way anonymity service for the reviewers. That 
is the reviewers know the identity of the candidate, but the candidate does 
not know the identity of the reviewer. 

The number of participants in the system must be greater than three, oth- 
erwise the peer review system cannot provide anonymity. Suppose that A, B 
and C are the participants, and the set of ordered pairs containing the reviewer 
and the candidate is {(A, B), {B, C), {C, A)}. A will know that C is its reviewer 
because it is reviewing B and if B is reviewing A then C has to review itself, 
which is not allowed. The reasoning for the case when n = 2 is trivial. 
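
The argument can be checked by brute force. The sketch below enumerates every assignment of reviewers to candidates with no fixed points and collects the peers who could be reviewing a given participant, given what that participant already knows (whom it reviews). The encoding of peers as integers and the function name are our own illustration.

```python
from itertools import permutations


def possible_reviewers(n, me=0, my_candidate=1):
    """Peers who could be reviewing `me`, given that `me` reviews `my_candidate`."""
    reviewers = set()
    for sigma in permutations(range(n)):       # sigma[i] is the candidate reviewed by i
        if any(sigma[i] == i for i in range(n)):
            continue                           # nobody may review itself
        if sigma[me] != my_candidate:
            continue                           # inconsistent with what `me` knows
        reviewers.add(sigma.index(me))         # the peer whose candidate is `me`
    return reviewers


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(n, sorted(possible_reviewers(n)))
```

With three participants only one reviewer remains possible, so the reviewer is identified; with four participants every other peer is still a possible reviewer.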

A challenging (and interesting) problem that is inherent in the problem state- 
ment is that when two participants collude they will be able to obtain some in- 
formation that could weaken the anonymity of honest participants. Clearly, the 
information that colluding participants obtain is inversely proportional to the 
total number of participants in the system and directly proportional to the num- 
ber of colluding participants. We believe that overcoming this problem would be 
difficult without weakening the security services for honest participants. 



3.1 Basic Solution 

A simple solution to solve the peer review problem could consist of three steps. 

Step 1 Every participant wishing to participate in the protocol signs a random
message and publishes the signature and message in a publicly readable
bulletin board, $B_1$. Let the number of signatures in $B_1$ be n, which is
the number of participants.

Step 2 Each participant generates a random pseudonym and anonymously
publishes its pseudonym in a publicly readable bulletin board $B_2$. The step
completes when n pseudonyms are published. Let the set of pseudonyms
be represented by PS.

Step 3 Each participant in turn chooses a pseudonym from $B_2$, such that it
does not select the pseudonym it submitted. Let this choice be ps. The
participant then generates the proof for its knowledge of the secret
corresponding to one of the pseudonyms in the set PS\{ps}, without
revealing its pseudonym. It signs its identity, choice and the proof, and
submits the signature along with the message to a bulletin board $B_3$.
It also removes its choice from $B_2$, so that nobody else can make the
same choice. This phase completes when n valid messages along with their
signatures are present in the bulletin board $B_3$. Anyone may check if
every public key used for verifying the signatures in $B_1$ is also used in
$B_3$.

Drawbacks and Solution: The protocol proposed assumes honest participants, 
which may not be very desirable. The protocol has the following drawbacks: 

P1 Two participants, say i and j, can reveal their pseudonyms as $u_i$ and $u_j$ to
each other, so that they can select each other.

P2 Two participants, say i and j, can generate the transcripts in Step 3 for each
other, so that they can select themselves.

P3 Since $v_i$ is only a short term secret, participant i can reveal this value to j,
so that j can select twice. This would allow j to select itself.

P4 The system does not provide anonymity revocation, which may be required
in common applications.

P5 An attacker can mount a denial of service attack on the system and remain
unidentified, because Step 2 does not guarantee that only the participants
involved in Step 1 submit pseudonyms, nor that each submits only one.

It seems difficult to overcome problem P1. Moreover, P1 does not adversely
affect the goals of the protocol. But P2 and P3 do adversely affect the goals of
the protocol. These problems can be solved if the participants are forced to use 
their long term secret values, namely private key corresponding to their certified 
public key, to generate the transcripts in Step 3. P4 can be solved by linking Step 
2 to Step 1, so that the link can be computed if necessary. P5 can be solved by 
issuing only one anonymous token to every participant who registered in Step 1 
and accepting only one pseudonym for every anonymous token in Step 2. Note 
the similarities in the solutions for P4 and P5. 

It is interesting to note that problems with traits similar to P2 and P3
are observed in other protocol applications as well. Non-transferability of
electronic cash [13], receipt-free electronic voting [12] and prevention of purchase of
votes [10] are some examples of such problems in other protocol
applications.

3.2 The Protocol Schema

We shall now describe a three phase protocol schema that overcomes problems 
P2 through P5. We assume that all participants possess certified public keys 
that support digital signature and authentication schemes. We also assume the 
existence of a token issuer TI, whose public key y_t is available to all the 
participants through a secure channel, and a supervisor S whose role is to act 
as a monitor of the system. Note that no explicit trust need be placed on S 
because all participants are required to generate publicly verifiable proofs. 
Let the system have n participants. The three phases of the schema are: 

Phase 1 Participant i generates a message C_i (for commitment), signs this 
        message with the private key corresponding to its public key, say 
        y_i, sends the message and the signature, say D_i, to TI and obtains 
        an anonymous token, AT_i, such that only participant i knows the 
        ordered pair (y_i, AT_i). Note that AT_i could be a blind signature 
        or an electronic coin that can be verified using the public key of 
        TI, y_t. All participants must participate in this phase before 
        proceeding to the next phase. This can be checked when n valid 
        signature tuples, (C_i, D_i), are submitted and n tokens are 
        withdrawn from TI. 

Phase 2 Participant i (anonymously) submits AT_i to S, proves ownership of 
        AT_i, submits its pseudonym u_i = secret(v_i), where secret could be 
        a one way function, and keeps v_i as its secret. After verifying the 
        proofs, S publishes (u_i, AT_i) in a publicly accessible directory, 
        along with the proofs. All participants must participate in this 
        phase before proceeding to the next phase. This can be checked when 
        n tokens are submitted to S. Note that to create a strong link 
        between Phase 1 and this phase, the value of u_i must be a function 
        (or part) of the anonymous token AT_i. In other words, it cannot be 
        randomly generated. 

Phase 3 Participant i chooses its reviewer to be the owner of the pseudonym 
        u_j, such that j ≠ i, generates transcripts to prove that it knows 
        the secret value corresponding to one of the n − 1 public values in 
        the set {u_l | l ≠ j} and commits to the choice by signing the 
        choice and the transcripts of the proof. If S successfully verifies 
        the proofs and the signature, it publishes the tuple (y_i, u_j) 
        along with the proof and signature in a publicly accessible 
        directory. Participant j can query the public directory to know the 
        identity of its candidate, y_i. If n participants complete this 
        phase and the public key used for verifying D_i was used to verify 
        the signature for commitment to the choice, then S announces the 
        protocol to be complete. If participant n cannot prove that it knows 
        the secret corresponding to one of the n − 1 public values in the 
        set {u_l | l ≠ j}, then u_j must be its pseudonym. This event 
        results in a deadlock. Our preliminary analysis suggests that the 
        probability of deadlock decreases as n increases and is bounded 
        above by the value 1/2e, where e is the base of the natural 
        logarithm. In this case, S announces the protocol to be incomplete 
        and all the participants must start the protocol anew from Phase 1. 
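
The deadlock event can be explored numerically for small n. The sketch below 
is not part of the paper's analysis. It assumes that participants choose in 
index order and that each picks uniformly at random among the pseudonyms that 
are still unselected and are not its own; neither assumption is stated in the 
schema. The function name deadlock_probability is hypothetical. 

```python
from fractions import Fraction
from functools import lru_cache


def deadlock_probability(n: int) -> Fraction:
    """Exact probability that the last of n participants is left with only
    its own pseudonym (ASSUMED model: index order, uniform choices)."""

    @lru_cache(maxsize=None)
    def stuck(i: int, remaining: frozenset) -> Fraction:
        # Last participant: deadlock exactly when the single remaining
        # pseudonym is its own.
        if i == n - 1:
            return Fraction(1 if remaining == frozenset({i}) else 0)
        # Participant i may select any remaining pseudonym except its own.
        choices = [u for u in remaining if u != i]
        total = sum((stuck(i + 1, remaining - {u}) for u in choices),
                    Fraction(0))
        return total / len(choices)

    return stuck(0, frozenset(range(n)))


if __name__ == "__main__":
    for n in range(2, 9):
        p = deadlock_probability(n)
        print(n, p, float(p))
```

Under these assumptions the function returns 0 for n = 2 and 1/4 for n = 3. 
The state space grows exponentially, so the exact computation is only 
practical for small n. 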

Since the technology used to generate AT_i provides computational anonymity, 
the resulting system will provide fair peer review. If AT_j can be linked to 
y_j in Phase 1, then y_j can be linked to y_i using the information from the 
tuples (AT_j, u_j) and (u_j, y_i). 

4 The Protocol 

We shall now present our proposal to realise the functionality of each phase 
mentioned in Section 3.2. We shall use the electronic coin technology [8] for the 
anonymous token facility (Section 2.3) in Phases 1 & 2, and partial proof for 
knowledge of discrete logarithm (Section 2.2) in Phase 3. 

System Setup. The supervisor of the system, S, selects a large prime p such that 
computing discrete logarithms in Z_p^* is intractable. S also selects a 
generator g of the group Z_p^*. Henceforth, all arithmetic will be computed 
modulo p, unless stated otherwise. The token issuer, TI, possesses a public 
key y_t of the form y_t = g^{x_t}, where x_t ∈_R Z_p^* is the private key 
corresponding to y_t. The tuple (g, p, y_t) is published as the public 
parameters for the selection system. The supervisor maintains two bulletin 
boards with read permission for everyone and edit permission only for the 
supervisor. Let the two bulletin boards be labelled A and B. Bulletin board A 
will contain unselected pseudonyms and bulletin board B will contain the 
selected pseudonyms. 

Let there be n participants in the system, such that n > 4. The public 
key of participant i, y_i (= g^{x_i}, x_i ∈_R Z_p^*), is published in a 
certified public directory with x_i as the corresponding private key. Every 
participant in the system possesses a certified public key. 

Additional system parameters required for the electronic cash technology [8] , 
which will be used as an anonymous token, are published. 



4.1 The Proposal 

We shall now present our proposal for the peer review problem, assuming honest 
participants. The next section will present modifications to this protocol that 
will greatly relax this assumption. 

Phase 1. Participant i generates and signs a message to obtain a 
message-signature tuple (C_i, D_i) and sends the tuple to TI (who verifies the 
signature using i's public key). The token will be a blind signature on a 
message by TI that can be verified using its public key y_t. We will use 
Brands' blind signature scheme [2] in the withdrawal protocol. Participant i 
chooses a random value v_i ∈_R Z_p^* and computes u_i = g^{v_i}. It then lets 
u_i be the message to be blindly signed by TI and obtains an anonymous token 
AT_i by executing the Withdraw protocol of the electronic cash technology with 
TI. Thus, AT_i = blindSignature(u_i, ..., y_t). We refer to Appendix A for an 
exposition on the details of the withdrawal protocol. 

Computational Requirements: The computational requirements will be the 
same as those of the electronic cash technology. Suppose the electronic 
cash scheme proposed by Frankel, Tsiounis and Yung [8] is employed; then TI 
must perform 2n(dl + 1) modular exponentiations and each participant must 
perform 13(dl + 1) modular exponentiations, where dl denotes the number of 
times the deadlock situation occurs before the protocol completes. 

Phase 2. The following steps are performed by individual participants and S: 

Step 2.1 Participant i anonymously contacts S, presents the tuple (AT_i, u_i) 
         to S, and engages in the Payment protocol with S. S checks if AT_i 
         contains the blind signature by TI on u_i. 

Step 2.2 If S successfully verifies the transcripts then it publishes the 
         tuple (AT_i, u_i) in a public directory along with the transcripts 
         of the Payment protocol. 

Step 2.3 S enters u_i into A. 

All participants must complete this phase before the protocol can proceed to the 
next phase. 

Computational Requirements: S must perform 11n(dl + 1) modular 
exponentiations and the participants must perform 6(dl + 1) modular 
exponentiations. 

Phase 3. The following steps are performed by individual participants and S: 

Step 3.1 Participant i authenticates to S using its public key y_i. 

Step 3.2 Participant i chooses a pseudonym u_j such that j ≠ i from A. 

Step 3.3 Participant i presents u_j to S along with the transcripts generated 
         using the function PPKGen (see Section 2.2) with {u_l | l ≠ j} and 
         v_i as the inputs to the function and its signature on the transcript 
         that can be verified using its public key y_i. 

Step 3.4 S verifies the transcripts sent by participant i using the function 
         PPKVer with {u_l | l ≠ j} as the input to the function. If it 
         correctly verifies the transcripts and the signature on the 
         transcripts using the public key y_i, it removes u_j from A, adds 
         u_j to B and publishes the tuple (u_j, y_i) in a public directory. 

Step 3.5 Participant j can consult the public directory to find y_i as its 
         candidate to be reviewed. 
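
Steps 3.3 and 3.4 rely on a proof that the prover knows the discrete logarithm 
of one of the n − 1 values in {u_l | l ≠ j} without revealing which one. The 
functions PPKGen and PPKVer are defined in Section 2.2; the sketch below is 
only a stand-in for them. It assumes a standard OR-composition of Schnorr 
proofs made non-interactive with a hash, which may differ from the paper's 
construction. The names ppk_gen and ppk_ver, the toy group (p = 2039, 
q = 1019, g = 4) and the use of a prime-order subgroup are all assumptions, 
and the parameters are far too small for real use. 

```python
import hashlib
import secrets

# ASSUMED toy group: p = 2q + 1 with q prime, g = 4 generates the order-q subgroup.
P, Q, G = 2039, 1019, 4


def _challenge(pubs, commitments) -> int:
    data = repr((tuple(pubs), tuple(commitments))).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def ppk_gen(pubs, idx, secret):
    """Prove knowledge of log_g(pubs[idx]) without revealing idx."""
    n = len(pubs)
    c, s, t = [0] * n, [0] * n, [0] * n
    for j in range(n):
        if j != idx:
            # Simulated branches: pick challenge and response first.
            c[j] = secrets.randbelow(Q)
            s[j] = secrets.randbelow(Q)
            t[j] = pow(G, s[j], P) * pow(pubs[j], Q - c[j], P) % P
    w = secrets.randbelow(Q)
    t[idx] = pow(G, w, P)                       # real commitment
    c[idx] = (_challenge(pubs, t) - sum(c)) % Q  # c[idx] was 0 in the sum
    s[idx] = (w + c[idx] * secret) % Q
    return c, s


def ppk_ver(pubs, proof) -> bool:
    c, s = proof
    if len(c) != len(pubs) or len(s) != len(pubs):
        return False
    if not all(0 <= v < Q for v in list(c) + list(s)):
        return False
    # Recompute every commitment as g^s * y^(-c); y has order q.
    t = [pow(G, sj, P) * pow(y, Q - cj, P) % P
         for y, cj, sj in zip(pubs, c, s)]
    return sum(c) % Q == _challenge(pubs, t)


# Demo: five pseudonyms u_l = g^{v_l}; participant 0 selects reviewer j = 2.
secret_values = [1 + secrets.randbelow(Q - 1) for _ in range(5)]
pseudonyms = [pow(G, v, P) for v in secret_values]
j = 2
candidates = [u for l, u in enumerate(pseudonyms) if l != j]  # {u_l | l != j}
proof = ppk_gen(candidates, 0, secret_values[0])  # participant 0 is first
print(ppk_ver(candidates, proof))
```

Because the set passed to the verifier excludes u_j, a participant whose only 
known secret is that of u_j cannot produce an accepting proof, which is what 
rules out self-selection. 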

When participant n engages in this protocol, there will be only one entry in 
A. If the last entry happens to be u_n (the pseudonym of participant n), then 
a deadlock is said to have occurred. In this case participant n cannot generate 
valid transcripts in Step 3.3, as it will not possess the knowledge of the 
discrete logarithm for any of the objects in the set {u_l | l ≠ j}. 
Participant n then must prove that it knows the discrete logarithm of u_j 
using the protocol construct described in Section 2.1. If S successfully 
checks this then it publishes the transcripts sent by participant n along with 
its signature on the transcript and announces the protocol to be incomplete, 
in which case all participants must restart the protocol from Phase 1. 

Computational Requirements: S must perform (n^2 − n)(dl + 1) modular 
exponentiations and each participant must perform (2n − 3)(dl + 1) modular 
exponentiations (see Section 2.2). 

Anonymity Revocation: In the system an additional entity T, the trustee, will 
be involved to facilitate anonymity revocation. Then S, TI or any other 
authorised entity can engage in a Trace protocol (see owner tracing in [8]) 
with T to obtain/compute the tuple (y_i, AT_i), which can link y_i to u_i when 
the public information (AT_i, u_i) is used. To reduce the level of trust that 
must be placed on T, its functionality can be distributed. 

4.2 Security Analysis 

This section will present an analysis of the phases to elucidate its achievement 
of the desired properties. 

Property 1: It achieves permutation without any fixed points. In Phase 1, when 
participant i authenticates to TI using its public key y_i, it receives only 
one AT_i. If more than one token was issued to participant i using y_i, then 
TI can be held responsible (all transcripts are publicly verifiable and 
signed by individual entities). Phase 2 allows only one pseudonym to be 
submitted for every AT_i. Phase 3 requires participant i to prove its 
knowledge for at least one pseudonym in the set of pseudonyms that does not 
contain its choice. In order to pass this phase, participant i cannot choose 
itself. Thereby, the protocol is a permutation without fixed points. 

Property 2: Since every user is allowed to submit only one pseudonym and 
selects a different pseudonym (in the same category of pseudonyms), every 
reviewer is also a candidate. 

Property 3: Reviewers are anonymous from the candidate and the candidate is 
not anonymous from the reviewer. Since every user chooses the pseudonym of 
its reviewer after authentication (using the public key, say y_i), this choice 
is public and the reviewer (say u_j) can know the identity of the candidate. 
From the publicly known tuples (AT_j, u_j) and (u_j, y_i), candidate i cannot 
know the identity of reviewer j, if the technology used for generating 
anonymous tokens does provide anonymity. Candidate i cannot obtain the tuple 
(y_j, u_j) by observing the protocol runs in Phase 3, if the proof system used 
is witness indistinguishable. Proposition 1 provides the proof for this 
property. 

Proposition 1 The system provides anonymity service to the reviewers. 

Proof: Assume that the participants do not collude and the electronic cash 
technology prevents any entity other than participant i from computing the 
tuple (y_i, AT_i). S, by itself or in collusion, cannot correlate the values 
y_i and u_i using the public knowledge (AT_i, u_i). Since the functions PPKVer 
and PPKGen are witness indistinguishable (see [4]), S, by itself or in 
collusion, cannot correlate the value y_i with u_i using the outputs of the 
function PPKGen, as computed by participant i. □ 

If the proof systems used for the anonymous token technology and partial 
proof of knowledge protocol construct are publicly verifiable, then the trust 
level on the token issuer, TI, and the supervisor, S, can be considerably 
reduced. The advantage of this approach is that it does not make any 
assumptions on the possible inclusion of an anonymity revocation mechanism. 
This is the advantage of abstracting the anonymous token, AT_i, to provide 
this service. Anonymity revocation mechanisms can be built into the token 
technology without affecting other core functionalities of the protocol 
(permutation without fixed points). 

5 Discussion 

We proposed a protocol schema that can be employed to design a peer review 
system. If we look at the peer review process from a higher abstraction, it is 
evident that peer review is a secure selection process with one-way anonymity 
service, most of which is achieved only in Phase 3. 

The generic selection protocol P is a mapping defined as P : S → C with 
additional constraints to achieve specific security services, where S is the 
set of selectors and C is the set of choices. The peer review system is a 
special case of the selection protocol where S is the set of reviewers, C the 
set of candidates and S = C, with constraints to ensure no fixed point and 
one-way anonymity for participants in S. 

There are similarities between the basic peer review process and other pro- 
tocol problems, as follows: 

1. Electronic Voting: Participants in the set of voters S cast a vote that 
selects an entity from the set of candidates C. Only the participant knows 
the selection; that is, the voter-vote relation is known only to the voter. 
To achieve this goal, confidentiality of information on selection can be 
translated to confidentiality of information on the identity (anonymity) of 
the selector (voter). 

2. Contract Bidding: Participants in the set of bidders S commit to a bid 
that selects a value from the set of bid options C. Only the participant 
knows the value of the bid. 

3. Conference Paper Review: Reviewers from the set of programme com- 
mittee members S select objects from the set of submitted papers C with 
constraints on the selection, like the member must not be from the same 
organisation from where the paper originated etc. The reviewer should not 
know the details pertaining to the authors of the paper and the author of 
the paper should not know the identity of the reviewer. This seems to be 
a very complex system primarily due to the number of possible constraints 
and the requirement for two way (mutual) anonymity. 

Some of the seemingly conflicting requirements that are evident in these prob- 
lems, as in the peer review problem, are: 

1. authentication and anonymity; 

2. confidentiality and validation of the information. 

Future research will be directed towards solving these problems by employing 
the method described in this paper. It is also planned to investigate the 
possibility of obtaining an exact expression for the probability of deadlock. 

In this section a brief overview of the fair off-line cash scheme, proposed by 
Frankel, Tsiounis and Yung [8,9], will be provided. 

System settings The mint chooses primes p and q such that |p − 1| = δ + k for 
a specified constant δ, and p = γq + 1 for a small integer γ. A unique 
subgroup G_q of prime order q of the multiplicative group Z_p^* and generators 
g, g_1, g_2 of G_q are defined. The mint's secret key x_B ∈_R Z_q is created. 
Hash functions H, H_0, H_1, ..., from a family of correlation-free one way 
hash functions are defined. The mint publishes p, q, g, g_1, g_2, 
(H, H_0, H_1, ...) and its public keys h = g^{x_B}, h_1 = g_1^{x_B}, 
h_2 = g_2^{x_B}. The public key of the trustee, T, of the form 
f_2 = g_2^{x_T} is also published, where x_T ∈_R Z_q. Note that T should be a 
distributed entity to reduce the level of trust placed on it. 

The mint associates the user with the identity I = g_1^{u_1}, where 
u_1 ∈_R G_q is generated by the user such that g_1^{u_1} g_2 ≠ 1. The user is 
expected to prove the knowledge of discrete logarithm of I w.r.t. g_1. The 
user computes z' = h_1^{u_1} h_2 = (I g_2)^{x_B}. 

Function Withdraw: This protocol creates a restrictive blind signature on I, 
so that at the completion of the protocol the user obtains a valid signature of 
the mint on (I g_2)^s for a random secret value s known only to the user. The 
signature sig(A, B, u_i) = (z, a, b, r) satisfies the following equations: 

    g^r = h^c a  and  A^r = z^c b,  where c = H(u_i, A, B, z, a, b). 

In our proposal the anonymous token AT_i would be the tuple 
(u_i, A, B, z, a, b), h the public key of the token issuer TI and u_i the 
pseudonym of participant i whose identity in the system is I. The withdrawal 
protocol is of the form: 



User                                              Mint 

                                                  w ∈_R Z_q 
                        <---- a', b' ----         a' = g^w, b' = (I g_2)^w 
s, v_i ∈_R Z_q 
A = (I g_2)^s and u_i = g^{v_i} 
z = z'^s 
x_1, x_2, u, v ∈_R Z_q 
B_1 = g_1^{x_1} 
B_2 = g_2^{x_2} 
B = [B_1, B_2] 
a = (a')^u g^v 
b = (b')^{su} A^v 
c = H(u_i, A, B, z, a, b) 
c' = c/u                ---- c' ---->
                                                  r' = c' x_B + w mod q 
r = r'u + v mod q       <---- r' ---- 

The user verifies that g^{r'} = h^{c'} a' and (I g_2)^{r'} = z'^{c'} b'. 
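
The algebra of the withdrawal protocol can be checked with a toy run. The 
sketch below assumes the Brands-style checks g^r = h^c a and A^r = z^c b for 
the final signature, together with the user-side checks on r'; the parameters 
(p = 2039, q = 1019 and the generators 4, 9, 25), the hash construction and 
the function names are illustrative assumptions, not values from the scheme. 
It needs Python 3.8 or later for the modular inverse via pow. 

```python
import hashlib
import secrets

# ASSUMED toy parameters: p = 2q + 1, and every square other than 1
# generates the subgroup of prime order q.
P, Q = 2039, 1019
G, G1, G2 = 4, 9, 25


def H(*values) -> int:
    digest = hashlib.sha256(repr(values).encode()).digest()
    return int.from_bytes(digest, "big") % Q


def rand_exp() -> int:
    return 1 + secrets.randbelow(Q - 1)  # uniform in [1, q-1]


def withdraw_demo() -> bool:
    # Mint key and user identity.
    x_b = rand_exp()
    h = pow(G, x_b, P)
    identity = pow(G1, rand_exp(), P)          # I = g_1^{u_1}
    base = identity * G2 % P                   # I g_2
    z_prime = pow(base, x_b, P)                # z' = (I g_2)^{x_B}

    # Mint's first message.
    w = rand_exp()
    a_prime, b_prime = pow(G, w, P), pow(base, w, P)

    # User blinds the values.
    s, v_i, u, v, x1, x2 = (rand_exp() for _ in range(6))
    A = pow(base, s, P)
    u_i = pow(G, v_i, P)                       # pseudonym
    z = pow(z_prime, s, P)
    B = (pow(G1, x1, P), pow(G2, x2, P))
    a = pow(a_prime, u, P) * pow(G, v, P) % P
    b = pow(b_prime, s * u, P) * pow(A, v, P) % P
    c = H(u_i, A, B, z, a, b)
    c_prime = c * pow(u, -1, Q) % Q            # c' = c / u mod q

    # Mint responds; user checks and unblinds.
    r_prime = (c_prime * x_b + w) % Q
    user_ok = (pow(G, r_prime, P) == pow(h, c_prime, P) * a_prime % P and
               pow(base, r_prime, P) == pow(z_prime, c_prime, P) * b_prime % P)
    r = (r_prime * u + v) % Q

    # Anyone can now verify the unblinded signature (z, a, b, r).
    sig_ok = (pow(G, r, P) == pow(h, c, P) * a % P and
              pow(A, r, P) == pow(z, c, P) * b % P)
    return user_ok and sig_ok


if __name__ == "__main__":
    print(withdraw_demo())
```

The mint only ever sees a', b', c' and r', none of which appears in the final 
tuple, which is what makes the token unlinkable to the withdrawal session. 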

Function Payment: This protocol is performed in an anonymous channel. The 
following table provides the sketch for the protocol. Note that, in the 
protocol, the tuple (D_1, D_2) is an ElGamal encryption of the identity of the 
user I under the public key of T. Also note how the identity of the shop I_S 
is bound to the coin when the challenge d is generated by the shop. 



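User                                              Shop 

m ∈_R Z_q 
D_1 = I f_2^m, D_2 = g_2^m 
      ---- u_i, A_1, A_2, A, B, (z, a, b, r), D_1, D_2 ----> 
                                                  A = A_1 A_2, A ≠ 1 
                                                  sig(A, B, u_i) = (z, a, b, r) 
                                                  d = H_1(A_1, B_1, A_2, B_2, I_S, date/time) 
                                                  s_0, s_1, s_2 ∈_R Z_q 
                                                  D' = D_1^{s_0} g_2^{s_1} D_2^{s_2} 
                                                  f_2' = f_2^{s_0} g_2^{s_2} 
      <---- D', f_2', d ---- 
V = H_1((D')^s / (f_2')^{sm}) 
r_1 = d(u_1 s) + x_1 
r_2 = ds + x_2 
      ---- r_1, r_2, V ----> 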











Function Deposit: The shop deposits the payment transcripts to the mint, 
which checks the transcripts using the same checking equations that the shop 
employed during the payment protocol. If the equations hold then the mint 
deposits a suitable amount into the shop’s account. 

Function Trace: To trace the identity of the user who engaged in the payment 
protocol that resulted in a particular deposit, the mint contacts T and presents 
the transcripts submitted by the shop. T can then decrypt the information on 
identity using the ElGamal ciphertext tuple (D_1, D_2). 
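
The Trace function amounts to an ElGamal decryption by the trustee. The 
following sketch assumes the encryption D_1 = I f_2^m, D_2 = g_2^m with 
f_2 = g_2^{x_T}, as described for the Payment protocol, and uses toy 
parameters (p = 2039, q = 1019, g_1 = 9, g_2 = 25) and function names chosen 
only for illustration. 

```python
import secrets

# ASSUMED toy group: order-q subgroup of Z_p^* with p = 2q + 1.
P, Q = 2039, 1019
G1, G2 = 9, 25


def rand_exp() -> int:
    return 1 + secrets.randbelow(Q - 1)


def encrypt_identity(identity: int, f2: int):
    """User side: ElGamal-encrypt the identity I under the trustee key f_2."""
    m = rand_exp()
    return identity * pow(f2, m, P) % P, pow(G2, m, P)


def trace(d1: int, d2: int, x_t: int) -> int:
    """Trustee side: I = D_1 / D_2^{x_T}; D_2 has order q, so
    D_2^{q - x_T} is the inverse of D_2^{x_T}."""
    return d1 * pow(d2, Q - x_t, P) % P


x_t = rand_exp()                      # trustee's private key
f2 = pow(G2, x_t, P)                  # trustee's public key
identity = pow(G1, rand_exp(), P)     # I = g_1^{u_1}
d1, d2 = encrypt_identity(identity, f2)
recovered = trace(d1, d2, x_t)
print(recovered == identity)
```

If T is distributed, as the scheme recommends, the exponentiation by x_T would 
be replaced by a threshold decryption; that step is not shown here. 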

Abstract. We propose an efficient password-based key exchange protocol, 
which resists dictionary attacks mounted by a passive or active adversary 
and is a 3-pass key exchange protocol, whereas existing protocols are 4-pass or 
more. Thus, considering network traffic, it is able to reduce the total 
execution time in comparison with several other schemes. In particular, from 
the viewpoint of the client's computational cost, our protocol is suitable for 
mobile communications, because it reduces the number of modular 
exponentiations performed by the client (or mobile) in comparison with several 
other password-based protocols. Besides, the proposed scheme has the 
characteristics of perfect forward secrecy, and resists a known key attack. It 
also offers resistance against a stolen verifier attack, as do A-EKE, B-SPEKE, 
and SRP. Finally, the two parties involved in the protocol are able to agree 
on the Diffie-Hellman exponential g^{xy} in the proposed scheme. 



1 Introduction 

Generally, a public key cryptosystem is more convenient than a secret key 
cryptosystem in key establishment, but it is not easy to use due to the 
unmemorizable secret key or the difficulty of public key infrastructure 
establishment. To solve these problems, Bellovin and Merritt proposed a 
password-based protocol, Encrypted Key Exchange (EKE) [1], that allows both 
parties to share a common key by using only a password. Later, several schemes 
followed [2-6]. These password-based protocols are not a limited scheme in 
comparison with public key 



* This work was supported by KSEF (Korea Science and Engineering Foundation) under 
project 97-01-13-01-05 

JooSeok Song (Ed.): ICISC’99, LNCS 1787, pp. 147-155, 2000. 

© Springer-Verlag Berlin Heidelberg 2000 




148 Hyoungkyu Lee et al. 



scheme using certification. Since users can accomplish their purpose (e.g., key 
exchange) by using only passwords, they don’t need to use certification or a 
long secret key. However, these password-based protocols must be immune to 
password-guessing attacks such as a dictionary attack, where the dictionary 
means a list of probable passwords, because an adversary can use the 
dictionary to guess the correct password either on-line or off-line. In most 
cases, it is difficult to detect off-line password-guessing attacks since an 
adversary tries to verify guessed values using publicly available information. 
However, on-line password-guessing attacks are easily thwarted by counting 
access failures since an adversary uses the guessed value whenever he logs in. 
Thus, our main concern is to protect passwords from off-line 
password-guessing attacks. The password-based key exchange protocol is divided 
into two branches [5]: one is a plaintext-equivalent mechanism [1,3], the other 
is a verifier-based mechanism using a secret public key [2,4-6,10]. In the 
verifier-based mechanism, password and verifier correspond to private and 
public key. Recently, with respect to computational cost, Kwon and Song [6] 
proposed a more efficient protocol than several other protocols such as the 
combined B-SPEKE and the optimized SRP [4,5]. But the total execution time of 
a protocol depends on the number of protocol steps rather than on 
computational cost. The protocol proposed in [6] is a 4-pass key exchange 
protocol, as are those in [4,5]. In this paper, we propose a 3-pass 
password-based key exchange protocol whose computational cost is similar to 
[6]. Also, the proposed scheme doesn’t require a safe prime or a primitive 
root to thwart a partition attack or a subgroup confinement attack, unlike 
several protocols proposed in [1-4]. Therefore, we suggest the use of a large 
prime-order subgroup for efficiency. This paper examines the security and 
efficiency of the protocol. In section 2, we describe and analyze several 
protocols and standard techniques based on passwords. In section 3, the new 
key exchange scheme is proposed. In section 4, we examine the security of the 
proposed scheme, while we examine its efficiency in section 5. Finally, 
section 6 concludes. 



2 Historical Review of Existing Protocols 

In this section, we review several password-based key exchange protocols. In 
such protocols, the client performs key exchange by using only the secret 
password or something equivalent to it. The server, on the other hand, 
performs key exchange by one of two mechanisms [5]. As we mentioned earlier, 
one is the plaintext-equivalent mechanism, in which both client and server can 
access the same secret password or something equivalent to it. Thus, in the 
plaintext-equivalent mechanism, there is no difference of knowledge between 
the client and the server. The other is a verifier-based mechanism where the 
verifier has similar properties to a public key. That is, the verifier is 
easily computed from the password whereas obtaining the password from the 
verifier is computationally infeasible. In the verifier-based mechanism, the 
server cannot access the secret password; it can access only the verifier. 
The client, on the other hand, can access both password and verifier. Thus, 
there is a difference of knowledge between the client and the server. Of 
course, the verifier is kept secret by the server. 

Table 1 summarizes the characteristics of existing protocols. In Table 1, f() 
and h() represent public one-way functions stretching or hashing the secret 
exponent such as a password. Generally, only the client knows the plain 
password P. In SRP and KS, the server stores the salt s or s⊕P, and sends it 
to the client. 

Table 1. The Characteristics of Existing Protocols 

g : Primitive Element    P : Password    x, y, u : Random Number    s : Salt 

Protocol   Client        Server             Session Key    Modular p                      Authentication Mechanism 
DH-EKE     h(P)          h(P)               g^{xy}         safe prime                     plaintext 
SPEKE      h(P)          h(P)               h(P)^{2xy}     safe prime                     plaintext 
A-EKE      S(P), V(P)    V(P)               g^{xy}         safe prime                     verifier 
B-SPEKE    P, h(P)       h(P), g^P          h(P)^{2xy}     safe prime                     verifier 
SRP        k=h(salt,P)   s, g^k             g^{xy+uky}     safe prime (recommendation)    verifier 
KS         f(salt,P)     s⊕P, g^{f(s,P)}    g^{xy}         non-smooth prime               verifier 



In 1992, DH-EKE was introduced by Bellovin and Merritt [1]. It is a combination 
of asymmetric and symmetric cryptography that allows two participants to share 
a common key by the plaintext-equivalent mechanism. In DH-EKE, two 
participants encrypt their key material with a symmetric cryptosystem using 
the shared hash value of the secret password as a key. They thereby come to 
share the Diffie-Hellman exponential g^{xy}. Besides, even if legitimate users 
pick bad passwords, a reasonable level of security is maintained. However, 
this scheme has several shortcomings. It requires a safe prime and a primitive 
root due to the use of a symmetric cryptosystem such as DES. It also requires 
a careful choice of symmetric cryptography; otherwise, it is vulnerable to 
information leakage or a partition attack [1]. In 1996, Jablon proposed a 
novel scheme called SPEKE (Simple Password Exponential Key Exchange) [3]. In 
contrast with DH-EKE, SPEKE does not require symmetric cryptography. In this 
scheme, Jablon proposed that the base of the group used in Diffie-Hellman key 
exchange should be chosen as a function of the password. That is, h(P) is used 
as the base of the group, where h() is a hash function and P is a password. 
However, it also requires a safe prime to thwart the subgroup confinement 
attack, in which a middle person confines key material such as g^x mod p in 
Diffie-Hellman key exchange to a subgroup of small order [3]. Later, several 
verifier-based protocols, including extended versions of the above two 
protocols, followed [2, 4-6]. These protocols were motivated by password file 
compromise such as a stolen verifier attack [5]. That is, an adversary who 
obtains the verifier should still have to perform an off-line 
password-guessing attack using a dictionary to impersonate the client. In this 
section, we go into more detail about these protocols. DH-EKE is extended to 
A-EKE, while SPEKE is extended to B-SPEKE [2, 4]. As shown in Table 1, A-EKE 
uses {S(P), V(P)} as a private/public key pair for digital signature [2]. In 
A-EKE, the digital signature is used to prove the client’s knowledge of the 
password. Similarly, B-SPEKE authenticates the client by a second 
Diffie-Hellman method instead of a digital signature (see section 3 in [4]). 
In 1998, SRP was introduced by Wu [5]. Compared with B-SPEKE, SRP has the 
advantage of being more flexible in that it performs the key exchange without 
the base chosen as a function of the password. Furthermore, SRP has less 
execution time than B-SPEKE. However, SRP cannot agree on the correct 
Diffie-Hellman exponential g^{xy} due to the use of another random integer u 
for security (see section 3 in [5]). The agreed exponential of SRP is 
g^{xy+uky}, as shown in Table 1. Recently, Kwon and Song proposed a new 
scheme, denoted by KS in Table 1, for the correct Diffie-Hellman exponential 
g^{xy} [6]. Compared with the related protocols, their protocol is simple and 
efficient. Unlike A-EKE and B-SPEKE, both SRP and KS use the salt. In 
particular, KS uses s⊕P, which is the exclusive-or of password and salt, 
whereas SRP uses a salt transmitted as plaintext. However, we have to note 
that such use of salt still cannot preclude an adversary from obtaining 
information on the password from a stolen verifier, since the entropy of the 
password is low. 



3 Proposed Scheme 

In this section, we introduce a new scheme reduced to 3 passes without 
compromise of security or performance. The user names Alice and Bob correspond 
to client and server, respectively. h() is the hash function, and a is an 
element of large prime order q in GF(p). Let f() be a stretch function which 
extends the bit length of its pre-image to that of a secure exponent. Also, we 
omit mod p for simplicity when we describe our equations. We summarize the 
notation in Table 2. 



Table 2. The Details of a Notation in Proposed Scheme 

ID         User’s name or address 
p          Large prime modulus 
q          Large prime factor of p-1 
g          Primitive element in GF(p) 
a          Element of order q in GF(p) 
P          Password 
v          Verifier stored in server’s storage 
x, y, r    Randomly chosen integers 
e, t       Alice’s transmissions for key exchange and authentication 
h()        Public hash function 
f()        Public stretch function 
K          Session key 



The key exchange and authentication are performed by a verifier-based 
mechanism using the extra Nyberg-Rueppel one-pass scheme [7]. In this paper, 
the term extra means conversion of a public key scheme into a password-based 
scheme. Thus, in the Nyberg-Rueppel one-pass scheme, Bob’s secret key is 
removed and Alice’s secret key is replaced by f(P). First, we start with the 
password setup steps. To establish a password P with Bob, Alice computes the 
verifier v from f(P). Bob stores v with ID as Alice’s verifier. For 
authentication and key agreement, Alice selects two random integers x and r, 
where x, r ∈ {1, 2, ..., q-1}, and computes e = a^{x-r} and 
t = r + e·f(P) mod q. Then, Alice sends (e, t) with ID to Bob, initializing 
the key exchange. 

Alice → Bob : ID, (e, t) (1) 

Bob looks up Alice’s verifier, and obtains a^x by computing a^t · v^e · e from 
message (1). Then, he chooses a random integer y ∈ {1, 2, ..., q-1}, and 
computes K = a^{xy} as a session key. Bob sends a^y with h(K, e, t) to Alice. 

Bob → Alice : a^y, h(K, e, t) (2) 

Alice can also compute K = a^{xy} and a corresponding hash image from message (2). 
Then, she checks if the hash images match each other. If it is satisfied, Alice sends 
h(K, d) to Bob. 

Alice ^ Bob : h(K, d) (3) 

Bob can also compute the hash image corresponding to Alice’s message (3). If 
the hash images match each other, then both parties agree on the Diffie-Hellman 
exponential, i.e. a^{xy}. A simple description of each step is given in Figure 1. 
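
The three passes can be exercised end to end with toy numbers. The sketch 
below is an illustration under several assumptions that are not confirmed by 
the text: the verifier is taken to be v = a^{-f(P)} mod p, so that 
a^t · v^e · e recovers a^x; the second input of the hash in message (3) is 
taken to be a^y; and f() and h() are built from SHA-256. The parameters 
(p = 2039, q = 1019, a = 4) and all function names are hypothetical and far 
too small for real use. 

```python
import hashlib
import secrets

# ASSUMED toy group: a = 4 has prime order q = 1019 in GF(2039).
P, Q, A = 2039, 1019, 4


def h(*values) -> bytes:
    return hashlib.sha256(repr(values).encode()).digest()


def f(password: str) -> int:
    """Stretch function (assumed form): maps a password into [1, q-1]."""
    n = int.from_bytes(hashlib.sha256(password.encode()).digest(), "big")
    return 1 + n % (Q - 1)


def make_verifier(password: str) -> int:
    # ASSUMPTION: v = a^{-f(P)}; the source formula is not available.
    return pow(A, Q - f(password), P)


def run_protocol(password: str, verifier: int):
    """Return the shared key if both confirmation hashes check, else None."""
    # Message (1): Alice -> Bob : ID, (e, t)
    x = 1 + secrets.randbelow(Q - 1)
    r = 1 + secrets.randbelow(Q - 1)
    e = pow(A, (x - r) % Q, P)
    t = (r + e * f(password)) % Q

    # Bob recovers a^x and derives K.
    a_x = pow(A, t, P) * pow(verifier, e, P) * e % P
    y = 1 + secrets.randbelow(Q - 1)
    k_bob = pow(a_x, y, P)
    a_y = pow(A, y, P)
    tag2 = h(k_bob, e, t)                 # Message (2): a^y, h(K, e, t)

    # Alice derives K and checks Bob's hash.
    k_alice = pow(a_y, x, P)
    if h(k_alice, e, t) != tag2:
        return None
    tag3 = h(k_alice, a_y)                # Message (3), ASSUMED inputs

    # Bob checks Alice's hash.
    if h(k_bob, a_y) != tag3:
        return None
    return k_alice


v = make_verifier("correct horse")
shared = run_protocol("correct horse", v)
print(shared is not None)
```

With the correct password the recovered value equals a^x, so both sides derive 
K = a^{xy}. With a wrong guess, Bob recovers a different group element and the 
hash in message (2) does not verify, except for rare collisions in this tiny 
group. 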

Alice                                             Bob 

e = a^{x-r} 
t = r + e·f(P) mod q 
                    ---- ID, e, t ----> 
                                                  Look up : (ID, v) 
                                                  recover a^x 
                                                  K = (a^x)^y 
                    <---- a^y, h(K, e, t) ---- 
K = (a^y)^x 
Verify h(K, e, t) 
                    ---- message (3) ----> 
                                                  Verify message (3) 



Fig. 1. The Proposed 3-Pass Password-Based Key Exchange Protocol 



4 Security Analysis 

A password-based protocol must not leak the information about the passwords. The 
leakage of such information may cause the verifiability that allow an adversary to 
guess the password. Thus, an important requirement of password-based protocol is 
that it must be immune to password-guessing attacks such as a dictionary attack. It 
can be realized by a protocol where its transmissions offer unverifiability for guessed 
passwords. Typically, password guessing attack proceeds as follows (Also, see [9]) : 

• A password is stored in a computer file as the image of an unkeyed hash function.
When a user logs on and enters a password, it is hashed and the image is compared
to the stored value. An adversary can compute the hash values of guessed passwords
taken from a dictionary, that is, a list of probable passwords. If the adversary
then compares these values to the list of true stored password images, he or she
may find the correct password.
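
The dictionary attack described above can be illustrated with a minimal Python sketch. SHA-256 stands in for the unkeyed hash function, and the password file, user name, and word list are hypothetical values chosen only for illustration; none of them come from the paper.

```python
import hashlib

def unkeyed_hash(password: str) -> str:
    """Image of a password under an unkeyed hash (SHA-256 is an assumption)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical password file: user name -> stored hash image.
password_file = {"alice": unkeyed_hash("sunshine")}

# Hypothetical dictionary of probable passwords.
dictionary = ["123456", "password", "sunshine", "qwerty"]

def dictionary_attack(stored_image, candidates):
    """Return the first candidate whose hash image equals the stored one, else None."""
    for guess in candidates:
        if unkeyed_hash(guess) == stored_image:
            return guess
    return None

recovered = dictionary_attack(password_file["alice"], dictionary)
```

Each guess can be checked off-line because the stored image lets the adversary verify it. A protocol whose messages give no such check denies the adversary this test.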

The proposed protocol operates with a secret public key referred to as the verifier.
It resists dictionary attacks mounted by a passive or an active adversary. Also,
based on the Diffie-Hellman problem, the proposed scheme provides perfect
forward secrecy and is protected against known-key attacks such as the
Denning-Sacco attack [11, 15]. The details are as follows.



4.1 Protection from a Passive Adversary 

A passive adversary can eavesdrop on all protocol transmissions and mount an
off-line password-guessing attack. In the above protocol, an adversary can obtain
(e, t), d, and the hash values, but cannot learn from them any information that
would let him verify a guessed password.

4.2 Protection from an Active Impersonator 

An active impersonator can masquerade as a legitimate user and can also modify the
transmissions of a legitimate user. For the description, let A* and B* be
impersonators of Alice and Bob, respectively, who hold a guessed value P*. First,
A* can send Bob (e, t) where e=d'\ t=r+ef(P*) mod q. Then, Bob computes as a session key.

Thus, A* cannot obtain the correct Diffie-Hellman exponential cf. Also, A* cannot
perform the off-line password-guessing attack since, by the discrete logarithm
problem, he cannot learn y mod q. In a similar way, B* is easily detected since he
cannot generate h( (f, e, t) of message (2).



4.3 Resistance against a Stolen Verifier Attack [5] 

In the proposed scheme, Alice has to know f(P) to agree on the Diffie-Hellman
exponential. Thus, even if an active adversary obtains the verifier v, he cannot
impersonate Alice. That is, to impersonate Alice, an adversary has to guess the
effective f(P) from v. Note that Bob himself is always a potential adversary,
since he holds the verifier.



4.4 Resistance against a Partition Attack and a Subgroup Confinement Attack [1, 3]

To thwart a partition attack [1], the proposed scheme was designed not to leak
information about the password. This is possible because our protocol operates in
a group with a fixed base and needs no other cryptographic technique. Also, a
subgroup confinement attack [3] is easily thwarted by the use of a large
prime-order subgroup: an adversary cannot confine the transmissions of the
participants to a subgroup of small order. Thus, an adversary cannot mount
password-guessing attacks such as a subgroup confinement attack or a partition
attack, because all computations are performed over a large prime-order subgroup.



4.5 Comments on Parameter 

In 1978, Pohlig and Hellman introduced a technique, now known as the Pohlig-Hellman
decomposition, that reduces the running time of computing a discrete logarithm by
decomposing the original large problem into a number of smaller sub-problems [14].
By the Pohlig-Hellman decomposition, given the group GF(p) of order p-1, a primitive
element g, and g^x, the running time of finding x becomes O(√q), where p = qw + 1,
q is a large prime factor, and w is a product of smooth factors.

Van Oorschot et al. indicated that the basic Diffie-Hellman key exchange has a
potential drawback [13]. They showed that a middle-person attack can force the
shared key of both participants to become g^{xyq}. The middle-person attacker can
then easily find the session key, K = g^{xyq}, by exhaustive search. As mentioned
above, this attack is referred to as a subgroup confinement attack. It motivates
the use of subgroups of large prime order. Compared with a safe prime, a
prime-order subgroup also makes the prime p easier to construct: the prime-order
subgroup requires only that q be a secure exponent, whereas a safe prime requires
a guaranteed large prime divisor q = (p-1)/2. Thus, compared with existing
password-based protocols, our protocol has flexibility in constructing p and
precludes a subgroup confinement attack. See also [12] for the prime-order subgroup.
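
The subgroup confinement attack can be illustrated with a toy Python sketch. The parameters p = 67 = 11·6 + 1, q = 11, w = 6 and the primitive element g = 2 are illustrative assumptions, far too small for real use, and the membership test `in_prime_order_subgroup` is a hypothetical helper rather than a step of the proposed protocol. The sketch shows that once a middle person raises both exchanged values to the power q, the key can take only w values, whereas a check against the order-q subgroup rejects a tampered value.

```python
# Toy parameters (assumptions for illustration): p = q*w + 1 with a smooth cofactor w.
p, q, w = 67, 11, 6
g = 2  # primitive element modulo 67 (order p - 1 = 66)

def confined_keys():
    """Collect every key that results when a middle person raises both
    Diffie-Hellman values to the power q before forwarding them."""
    keys = set()
    for x in range(1, p - 1):                    # Alice's secret exponent
        for y in range(1, p - 1):                # Bob's secret exponent
            to_bob = pow(pow(g, x, p), q, p)     # tampered g^x -> g^(xq)
            to_alice = pow(pow(g, y, p), q, p)   # tampered g^y -> g^(yq)
            k_alice = pow(to_alice, x, p)        # g^(xyq)
            k_bob = pow(to_bob, y, p)            # g^(xyq)
            assert k_alice == k_bob
            keys.add(k_alice)
    return keys

def in_prime_order_subgroup(a):
    """Hypothetical check: accept only non-trivial elements of order q."""
    return a % p != 1 and pow(a, q, p) == 1

keys = confined_keys()               # only w distinct values are reachable
h = pow(g, w, p)                     # generator of the subgroup of prime order q
tampered = pow(pow(g, 5, p), q, p)   # a value confined by the middle person
```

With only w candidate keys, exhaustive search is trivial. Working in the subgroup generated by h and rejecting values that fail the order-q test removes that small search space.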



5 Efficiency 

The proposed scheme has two important advantages in terms of efficiency. One is the
number of protocol steps and the other is the low computational cost for the client.
The former is measured by the number of message exchanges required between the
parties involved in the protocol. The proposed scheme needs only three steps. Thus,
compared with the related schemes [4, 5, 6], the proposed scheme is better suited
to a remotely distributed environment, because the total execution time of a
protocol depends on the number of protocol steps rather than on computation time.
Consequently, the proposed scheme is less exposed to disconnections or transmission
errors caused by network traffic. It is also efficient in terms of each
participant’s computational cost. Typically, the computational cost of a protocol
is summarized as the number of modular exponentiations computed by each
participant. In our protocol, the client computes two modular exponentiations and
the server four. In [6], on the other hand, the client computes four and the server
three. In view of these results, our protocol can also be applied to mobile
communication environments, which generally require a low computational cost for
the mobile device (see section 2 in [8]). Since mobile devices are made small and
light to be portable, they usually have fewer resources and less computational
power than a server or base station. Also, although the total execution time of a
protocol depends mainly on the number of protocol steps, we represent the total
amount of execution time in parallel, as in [6]. It is as follows: For
E(client : server), E{g‘'' : ), E( : g) and E( :v‘) between step 1 and 2, and E( :g^), E(
'■(g’f) and E((g^f: ) between step 2 and 3. Thus, the proposed scheme requires six
for three steps, whereas KS proposed in [6] requires five for four steps. These
facts are summarized in Table 3. We exclude the execution time of A-EKE from
Table 3 because it depends on the digital signature scheme.



Table 3. The Amount of Computational Cost

Protocols           Modular Exponentiations    Total Execution Time    Number of
                    Client       Server        (in Parallel)           Passes
Combined B-SPEKE    3            4             7                       4
Optimized SRP       3            3             4                       4
KS                  4            3             5                       4
Proposed scheme     2            4             6                       3



6 Conclusion 

Several password-based key exchange protocols have been presented since DH-EKE
was introduced by Bellovin and Merritt. Such password-based protocols were
designed to provide strong authentication without depending on an external
infrastructure. In this paper, we described the history and properties of such
password-based protocols, and proposed a new 3-pass password-based key exchange
protocol in which both participants agree on the correct Diffie-Hellman
exponential g^{xy}. The proposed scheme performs well without compromising
security in comparison with existing protocols. As mentioned earlier, the proposed
scheme is suitable for mobile communications due to the low computational cost for
the client. From the viewpoint of the number of protocol steps, our 3-pass scheme
is more efficient than several 4-pass schemes such as [2, 4-6]. Besides, the
proposed scheme places fewer restrictions on the group parameters than related
schemes such as A-EKE or B-SPEKE.



Abstract. We present a new 2-pass authentication and key agreement
protocol for mobile communications. The protocol solves the weaknesses
of the PACS and Zheng’s 1.5-move protocol for the air interface of mobile
systems. The paper outlines the new protocol, examines its various aspects,
and compares them to those of the other 2-pass protocols, the PACS
and the 1.5-move protocol.



1 Introduction 

The demand for a variety of value-added services, such as electronic commerce,
continues to grow in mobile communications. Public key based techniques are likely
to be adopted rapidly in the implementation of security services for mobile
communications because they ease key management and support a variety of
security services, including digital signatures.

To provide authentication and key establishment between two entities in a mobile
communication system, several public key based 2-pass protocols have been
presented, including the PACS protocol [1] and the 1.5-move protocol [2]. The
PACS is a standard adopted by ANSI for personal communications systems. It is
based on Bellcore’s WACS (wireless access communications system) [3] and on
Japan’s PHS (personal handyphone system) [4]. The 1.5-move protocol was proposed
by Zheng. Its interesting feature is that most operations for the mobile user
device can be done off-line. These schemes were designed to meet certain
requirements that occur specifically in a mobile communication environment.
These requirements include user anonymity, scarce radio bandwidth, and the
limited computational power of the mobile user device.

Most public key based authentication and key agreement (hereinafter AKA)
protocols use either a 2-pass or a 3-pass scheme to exchange messages. The 2-pass
method offers advantages such as high bandwidth efficiency and rapid connection
setup, while the 3-pass scheme can provide a wider variety of security services.
It would be preferable if the advantages of both schemes could be combined while
satisfying more requirements.

This paper presents a 2-pass public key based AKA protocol for mobile
communications. The protocol provides non-repudiation of the mobile user to the
base station, yet retains the merits of the 2-pass scheme. Various aspects of the
protocol are examined and then compared with those of other existing 2-pass
solutions.



2 Properties for Mobile Communications 

The security features required for a protocol providing authentication and key 
establishment between a mobile user and base station have been identified in 
previous literature. Horn and Preneel [5] recommended some goals for wireless 
communication, while the European ASPeCT project [6] extended this to ini- 
tialize a mechanism for enabling payment of the value-added service. 

Since this study is primarily interested in achieving AKA between a mobile 
user and a base station, some of these properties associated with just payment 
initialization have been ignored. 

Definitions of security services for the AKA are specified in the American
National Standard X9.63-199x [7], and also described in [5,6,8]. For this study,
the properties considered in the development of a public key based
protocol for mobile communications were as follows.

S1 Entity Authentication: The assurance provided to entity U that entity
V has been involved in a real-time communication with entity U.

S2 Key Authentication: For a clearer description, this property is classified
into two categories: implicit key authentication and explicit key authentication.

- Implicit key authentication: The assurance that only the involved
entities can possibly compute the corresponding key.

- Explicit key authentication: The assurance that only the involved
entities can possibly compute the corresponding key, and that the
involved entities have actually computed it.

S3 Key Agreement: A key agreement scheme is a key establishment scheme in
which the keying data established is a function of the contributions provided
by both entities in such a way that neither party can predetermine the value
of the keying data.

S4 Assurance of Key Freshness: The assurance provided to each entity that
a newly established key is fresh and random, so that attacks based on the use
of information associated with compromised keys or previous data are prevented.

S5 Anonymity of Mobile User: This property provides confidentiality of the
mobile user’s location and movement. It is also one of the most
important requirements in mobile communications.

S6 Non-repudiation of User: This service guarantees undeniable evidence
related to the user charge or important data. The significance of this service
is increasing as a variety of value-added services are expected.

An AKA protocol must satisfy certain requirements that occur specifically in
a mobile communication environment, including the bandwidth limitation on the air
interface and the limited computational performance of the mobile user device.
Accordingly, the following efficiency properties need to be fully considered when
building a protocol.

R1 Minimum Number of Passes: To reduce latency, the number of
message exchanges required between entities should be minimal.

R2 Efficient Usage of Bandwidth: Due to the high cost of bandwidth on
the air interface, the total number of bits transmitted should be kept as
small as possible.

R3 Low Computational Load: The computational load for each entity
should be small. In particular, since the on-line performance of a mobile
user device is limited, it is desirable to reduce the load at the mobile side by
employing off-line computation.

3 The New Proposal 

3.1 Protocol Description 

Throughout this paper, the use of an elliptic curve cryptosystem [9,10,11]
and of the notation in Table 1 is assumed. For clarity of exposition,
optional data are omitted.



Table 1. Notation and its meaning.

Notation                    Meaning
$ID_E$                      An identifier of the entity E
$RT$ ($TS$)                 A real time value (a time stamp)
$hash(x)$                   The result of hashing the input x
$x_E$                       A secret key of the entity E
$P_E$                       A public key of the entity E, $P_E = x_E \cdot G$
$K_{MB}$                    The common session key between M and B
$G$                         A generator point of the elliptic curve
$Cert_E$                    A certificate of $P_E$
$Sig_E\{x\}$                A value x signed by the entity E
$r_E$                       A random number generated by entity E
$x \| y$                    Concatenation of x and y
$E_K\{x\}$ ($D_K\{x\}$)     The symmetric encryption (decryption) of x using key K
M                           A mobile station (user)
B                           A base station (service provider)



Figure 1 shows a flow diagram of the ordinary operation of the protocol in
which flows are faithfully relayed between two entities M and B. Points on an
elliptic curve such as $r_M \cdot P_B$ and $x_B \cdot T$ are regarded as binary
strings when involved in hashing.

M                                                  B

                 ←  $r_B \| Cert_B \| RT$

$K_{MB} = hash(r_B \| RT \| (r_M \cdot P_B))$
$T = r_M \cdot G$
$m = (ID_M \| ID_B \| r_B \| RT)$
$r = hash(m \| T)$
$s = r_M / (r + x_M) \bmod q$
$c = E_{K_{MB}}\{r \| s \| Cert_M\}$

                 $T \| RT \| c$  →

                                                   $K^*_{MB} = hash(r_B \| RT \| (x_B \cdot T))$
                                                   $D_{K^*_{MB}}\{c\}$
                                                   $T \stackrel{?}{=} s^* \cdot P_M + (s^* \cdot r^*) \cdot G$
                                                   $r^* \stackrel{?}{=} hash(m^* \| T)$

Fig. 1. Flow diagram of the new protocol.

The procedure of the developed protocol is executed as follows.

Prerequisites: It is assumed that the set of elliptic curve domain parameters
associated with the flow has been validated. It will also be shown that the use
of the elliptic curve system is the best choice to satisfy requirements R2
and R3.

The first pass from B to M:

- B broadcasts a random number $r_B$, a real-time value $RT$, and its public key
certificate $Cert_B$, where the identifier $ID_B$ and B's public key $x_B \cdot G$
are included in the certificate $Cert_B$.

The second pass from M to B:

- Entity M executes the following actions:

1. Extract $ID_B$ and $P_B$ from $Cert_B$

2. Generate a random number $r_M$

3. Compute a common session key $K_{MB} = hash(r_B \| RT \| (r_M \cdot P_B))$ and
a temporary key $T = r_M \cdot G$

4. Compute r and s, which serve as the signature

5. Encrypt r, s, and some other data using $K_{MB}$

6. Send T and RT to B together with the encrypted message

- Entity B executes the following actions:

1. Take T and RT from the received message and compute the common
session key $K^*_{MB} = hash(r_B \| RT \| (x_B \cdot T))$

2. Decrypt the received message c using $K^*_{MB}$ and extract M’s identifier
and public key from $Cert_M$

3. Compute $s^* \cdot P_M + (s^* \cdot r^*) \cdot G$ and compare it with the
received T; abort the protocol if the received T is invalid

4. Compute $hash(m^* \| T)$ and compare it with the decrypted $r^*$

If $s^*$ is equal to 0, the public key of the mobile station $P_M$ is not used for
signature verification. Hence both participants should confirm that s or $s^*$ is
not equal to 0.
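
The two passes above can be sketched end to end in Python. This is a toy illustration under loudly stated assumptions: the curve y^2 = x^3 + 2x + 2 over F_17 with generator (5, 1) of prime order 19 is a textbook-sized example, SHA-256 stands in for hash, a hash-derived XOR keystream stands in for the symmetric cipher, and a certificate is modeled as a bare (identifier, public key) pair with no CA signature. All function names are hypothetical.

```python
import hashlib
import secrets

P_FIELD, A = 17, 2        # toy curve y^2 = x^3 + 2x + 2 over F_17 (assumption)
G, Q = (5, 1), 19         # generator point and its prime order

def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_FIELD) % P_FIELD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD) % P_FIELD
    x3 = (lam * lam - x1 - x2) % P_FIELD
    return (x3, (lam * (x1 - x3) - y1) % P_FIELD)

def mul(k, pt):
    """Scalar multiplication k * pt by double-and-add."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def H(*parts):
    """hash(x1 || x2 || ...) using SHA-256 over a simple string encoding."""
    return hashlib.sha256("|".join(map(str, parts)).encode()).digest()

def xor_stream(key, data):
    """Toy stand-in for E_K / D_K: XOR with a hash-derived keystream."""
    stream = b"".join(hashlib.sha256(key + bytes([i])).digest()
                      for i in range(len(data) // 32 + 1))
    return bytes(a ^ b for a, b in zip(data, stream))

def mobile_pass(r_B, RT, cert_B, id_M, x_M):
    """M side of the second pass: returns the message (T, RT, c) and K_MB."""
    id_B, P_B = cert_B
    P_M = mul(x_M, G)
    m = (id_M, id_B, r_B, RT)
    while True:
        r_M = 1 + secrets.randbelow(Q - 1)
        T = mul(r_M, G)
        r = int.from_bytes(H(m, T), "big") % Q
        if (r + x_M) % Q != 0:             # r + x_M must be invertible mod q
            break
    s = r_M * pow(r + x_M, -1, Q) % Q      # s = r_M / (r + x_M) mod q
    K = H(r_B, RT, mul(r_M, P_B))
    payload = f"{r},{s},{id_M},{P_M[0]},{P_M[1]}".encode()
    return (T, RT, xor_stream(K, payload)), K

def base_pass(msg, r_B, id_B, x_B):
    """B side of the second pass: returns K*_MB if the signature verifies, else None."""
    T, RT, c = msg
    K = H(r_B, RT, mul(x_B, T))
    r, s, id_M, px, py = xor_stream(K, c).decode().split(",")
    r, s, P_M = int(r), int(s), (int(px), int(py))
    if s == 0:                             # s* = 0 would bypass P_M
        return None
    T_check = add(mul(s, P_M), mul(s * r % Q, G))
    m = (id_M, id_B, r_B, RT)
    ok = T_check == T and r == int.from_bytes(H(m, T), "big") % Q
    return K if ok else None

x_B, x_M = 7, 11                           # toy secret keys (assumptions)
cert_B = ("B", mul(x_B, G))
msg, K_M = mobile_pass(r_B=5, RT=19991209, cert_B=cert_B, id_M="M", x_M=x_M)
K_B = base_pass(msg, r_B=5, id_B="B", x_B=x_B)
```

In an honest run both sides end with the same key, because $r_M \cdot P_B$ and $x_B \cdot T$ are the same curve point. B learns M's identity only after decrypting c.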

3.2 Assessments on Signature Scheme 

In this subsection, we describe the relationship between the presented protocol
and Zheng’s signcryption scheme [12,13]. The details of the signcryption scheme
are explained in appendix A.

In the presented protocol, r and s are calculated in a way similar to that of
Zheng’s signcryption scheme. Compared with the unsigncryption process of Zheng’s
signcryption scheme, the presented protocol has the following distinctive features.

1. The anonymity of the mobile user is provided:

In the presented protocol, the base station computes the decryption key,
$K_{MB} = hash(r_B \| RT \| (x_B \cdot T))$, without knowledge of the mobile user’s
public key $P_M$. This is not the case in Zheng’s signcryption
scheme, as described in appendix A. Note that the disclosure of
$r_M \cdot G$ does not reduce the security of the signcryption scheme [12].

2. It has a directly verifiable non-repudiation procedure:

When the mobile user denies having signed a message, the base
station forwards the signature (r, s) and the corresponding message m, with the
certificate of the mobile user, $Cert_M$, to a judge. The judge can then settle
the dispute just by verifying the following.

$k = s \cdot P_M + (s \cdot r) \cdot G$ and $r = hash(m \| k)$

Zheng, in contrast, assumes a trusted judge or the use of a zero-knowledge
protocol to prevent the disclosure of $x_B$.

3. It prevents a key recovery attack by a judge:

Petersen and Michels [14] showed that the judge can decrypt any further
message once he has settled a dispute between the mobile user M and
the base station B. In Zheng’s signcryption scheme, the judge obtains
$E = u \cdot P_M + u \cdot (r \cdot G)$, where $u = s \cdot x_B \bmod q$, after he
settles a dispute. He can compute the temporary value
$K_{DH} = s^{-1} \cdot E + (-r) \cdot P_B = x_M \cdot (x_B \cdot G)$.

Then, for any signature $(r^*, s^*)$ he can compute
$E^* = s^* \cdot K_{DH} + s^* \cdot (r^* \cdot P_B)$ and the corresponding key
$k^* = hash(E^*)$.

The modified scheme, however, prevents the key recovery attack because the judge
obtains $E' = s \cdot P_M + s \cdot (r \cdot G)$ after he settles a dispute and
cannot compute $K_{DH}$ from $E'$.
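
As a consistency check, the verification equation can be derived from the signature definition $s = r_M / (r + x_M) \bmod q$ and $P_M = x_M \cdot G$. The derivation uses only quantities defined in Table 1 and Figure 1, and assumes that G has order q:

```latex
\begin{align*}
s \cdot P_M + (s \cdot r) \cdot G
  &= s \cdot (x_M \cdot G) + (s \cdot r) \cdot G \\
  &= \bigl(s\,(x_M + r) \bmod q\bigr) \cdot G \\
  &= r_M \cdot G = T .
\end{align*}
```

The value the judge obtains in the presented protocol is therefore the temporary key T itself, which is already transmitted in the clear in the second pass and does not involve $x_B$.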



3.3 Considerations 

An examination as to whether the requirements S1 to S6 are satisfied by the
protocol follows:

- S1: Since M makes a signature on the random number $r_B$ transmitted by
B, the verifier B can authenticate M, whereas M cannot authenticate B.

- S2: Explicit key authentication to B and implicit key authentication to M.
Since only the one who knows $r_M$ and $x_M$ can compute $K_{MB}$ and a signature
(r, s), and moreover the computed signature is encrypted with $K_{MB}$, the
verifier B is assured that $K_{MB}$ has actually been computed by the signer M.
Since the key $K_{MB}$ can only be computed by a holder of the secret key $x_B$,
the sender M has implicit key authentication.

- S3: Mutual agreement of the session key between M and B. This is because
the key is derived from the randomly chosen numbers $r_M$ and $r_B$ of M and B,
respectively.

- S4: Mutual assurance of key freshness. This is due to the fact that the
session key is derived from the randomly chosen numbers $r_M$ and $r_B$ of M and B,
respectively. Accordingly, both entities are assured of the freshness of the
session key.

- S5: Anonymity of the mobile user. Since all information on M is encrypted in
the protocol, anonymity is satisfied.

- S6: Non-repudiation of the user. This property is based on the use of the
signcryption scheme.

The above properties will be summarized in a table later when compared
with the properties of the other protocols.

The following examines attack prevention:

- Inclusion of $RT$ in the signed part of the message: To prevent a
play-in-the-middle attack

- Inclusion of $ID_M$ in the signed part of the message: To prevent a parallel
session attack

- Inclusion of $ID_B$ in the signed part of the message: To consider the
payment protocol [5]

- Inclusion of $r_B$ in the signed part of the message: To preclude a
time-memory trade-off attack [5]

- Inclusion of the random numbers $r_M$ and $r_B$ in the keying data: To
prevent a replay of an old key

The requirements R1 to R3, which occur specifically in a mobile communication
environment, are evaluated as follows:

- R1: Since the protocol is a 2-pass scheme, it has the minimum number of
message exchanges.

- R2: Since the protocol utilizes elliptic curve cryptographic algorithms, it
can achieve as small a bandwidth as possible without any loss of security as
compared with the RSA or ElGamal systems. The level of security assumed
here is equivalent to the strength of an ElGamal system over a finite field
with a 1024-bit prime, as shown in Table 2. The lengths of the
other parameters not specified in Table 2 are assumed to be as
shown in Table 3.

Table 2. Parameter lengths of RSA, Rabin, ElGamal, and ECC systems.

                          RSA                 Rabin           ElGamal         ECC
                          $|n| = 1024$        $|n| = 1024$    $|p| = 1024$    $E(F_q)$
                          $e = 2^{16} + 1$                    $|q| = 160$     $|q| = 160$
System parameter          -                   -               2208            481
Public key                1041                1024            1024            161
Secret key                2048                1024            160             160
Signature (single)        1024                1024            1184            320
Encrypt. (single)         1024                1024            2048            321
Certificate (including    2097 + Z*           2080 + Z*       2240 + Z*       513 + Z*
32 bit for identifier)

Z*: length of the common bits in each certificate type except the identifier,
public key and signature of a corresponding entity.



Table 3. Length in bits of some parameters.

Parameters        ID    RT (TS)    Random number    Hash output
Length in bits    32    32         160              160



Based on Table 2 and Table 3, we can compute the bandwidth of the protocol
as shown in Table 4. These are compared with those of the other protocols
in the next section.

Table 4. Length in bits of exchanging messages of the protocol.

                  B → M                 M → B
Length in bits    160 + 513 + Z + 32    161 + 32 + 320 + 513 + Z



- R3: Off-line computation is desirable at the mobile user end due to its limited
performance. In the new protocol, M needs to compute two multiplications on
an EC system, for $K_{MB}$ and T. These two multiplications can be
done off-line if the base station broadcasts its public key information
regularly and frequently. B then needs a single multiplication for $K_{MB}$ and
two multiplications for signature verification. All of these multiplications for
the base station must be done on-line.
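
The message lengths in Table 4 can be rechecked with a few lines of Python. The field lengths are taken from Table 2 (ECC column) and Table 3. The mapping of each term to a message field (for example, 161 bits for T and 320 bits for the pair r, s) is an interpretation rather than something the tables state explicitly, and the constant names are hypothetical.

```python
# Field lengths in bits, from Table 2 (ECC column) and Table 3.
RANDOM_NUMBER = 160   # r_B
REAL_TIME = 32        # RT
EC_POINT = 161        # T, assumed to have the length of an ECC public key
SIGNATURE = 320       # r and s together
CERT_FIXED = 513      # certificate without its common part Z

def message_lengths():
    """Return the fixed bit counts of both passes; each pass also carries one Z."""
    b_to_m = RANDOM_NUMBER + CERT_FIXED + REAL_TIME          # r_B || Cert_B || RT
    m_to_b = EC_POINT + REAL_TIME + SIGNATURE + CERT_FIXED   # T || RT || c
    return b_to_m, m_to_b

b_to_m, m_to_b = message_lengths()
total_fixed = b_to_m + m_to_b   # the full total is total_fixed + 2Z
```

The fixed parts come to 705 and 1026 bits, so the total is 1731 + 2Z bits.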



4 Comparison 

In this section, the properties of the new protocol are compared with those of
the PACS and Zheng’s 1.5-move scheme. Their flow diagrams are included in
appendix B.

4.1 Security Properties

Based on the description of security properties given in section 2, the security
features of all three protocols can be summarized as shown in Table 5.



Table 5. Comparison of security features.

Security features    The proposed        The PACS            The 1.5-move
                     M → B    B → M      M → B    B → M      M → B    B → M
Entity auth.         Y        N          Y        N          Y        N
Key auth.            YE       YI         YE       YI         YE       YI
Key agreement        Y        Y          (1)      (1)        (1)      (1)
Key freshness        Y        Y          N        N          N        N
Anonymity            Y        N          Y        N          Y        N
Non-repu.            Y        N          Y (2)    N          N        N

Y: satisfied, N: not satisfied, YE: satisfied explicitly, YI: satisfied
implicitly, (1): key transport scheme, (2): key revealed once the
signature is publicly open



Since the PACS and the 1.5-move protocol use a value related to the system time
instead of a random number in generating a key, they do not satisfy the property
of key freshness in the strict sense, whereas the new proposal does. In addition,
the new protocol utilizes a key agreement scheme, whereas the PACS and the
1.5-move protocol use a key transport scheme. In general, a key agreement
scheme can prevent either entity from generating a weakened or deliberately chosen key.

4.2 Weaknesses 

As shown in Table 5, some of the partly satisfied properties, taken together, can
cause the following weaknesses:

- Key is revealed if the signature is open: The PACS generates a signature
and uses the signature itself as the session key. The key can play the role of a
signature between the two entities involved, yet not publicly as a general
signature. This constraint limits a variety of applications, including
the initialization of the payment mechanism for value-added services.

- Impersonation of the mobile user by an inside attacker is possible:
This attack was introduced by Xu and Wang [15] and was applied to the
Beller-Chang-Yacobi protocol [16]. Zheng’s 1.5-move protocol has a severe
weakness: an inside attacker who knows the secret key of the base station,
$x_B$, can impersonate the mobile user without knowledge of the mobile
user’s secret key, $x_M$. The authenticity of the mobile user depends entirely on
the assumption that only the one who knows the mobile user’s secret key can
create the value $P_B^{x_M + x} \bmod p$, where $g^x \bmod p$ is the first
component of the mobile user's message (see appendix B.2). However, an inside
attacker can also create this value as $(P_M \cdot g^x)^{x_B} \bmod p$.

The presented protocol has no weakness to the best of our knowledge.

4.3 Bandwidth Efficiency

The lengths in bits of the exchanged messages were calculated for each protocol to
compare their bandwidth efficiency. A mobile communication environment requires
as small a bandwidth occupation as possible. By adopting elliptic curve
cryptographic systems (see Table 2), higher bandwidth efficiency and lower
computational complexity can be achieved than with the RSA or ElGamal
cryptographic systems. It was assumed that each protocol here employed an
elliptic curve cryptographic system with a generator of order about $2^{160}$.

Table 6 shows the bandwidth efficiency of each protocol; the comparisons
are based on the lengths given in Table 3. The proposed protocol has slightly
longer exchange messages than the PACS. This is due to the use of a random
number and a DH key exchange variant; these are the components that resolve the
weaknesses of the PACS. Since Z can include a version number, issuing party
name, or some other information, there can be a slight difference in total
length. The 1.5-move protocol has the shortest exchange messages, but it cannot
provide non-repudiation of the user. If a signature scheme were added to the
1.5-move protocol, the total length of its exchange messages would become
1790 + 2Z.



Table 6. Bandwidth efficiency [bits].

          The proposed    The PACS     The 1.5-move
B → M     705             545          545
M → B     1026 + Z        1090 + Z     925
Total     1731 + 2Z       1635 + 2Z    1470 + 2Z



In Table 6, it is assumed that the PACS uses ElGamal encryption on an elliptic
curve cryptographic system (see appendix C.3) instead of Rabin [17] encryption.
If the PACS protocol uses Rabin encryption as described in [1], the exchanged
messages are much longer than shown in Table 6.



4.4 Computational Load 

The computational load was calculated in bit multiplications for each protocol.
It is assumed that each protocol employs the same elliptic curve cryptosystem
with a generator of order about $2^{160}$.

Most of the computation time is spent on signature generation/verification,
public key encryption/decryption, and multiplication for session key
generation. Accordingly, the computational load related to hashing, symmetric
key encryption, and some additions and scalar multiplications was ignored where
it was minor. In addition, the load for certificate verification is also ignored
as it is common to all protocols. The method of calculating the computational
load of each associated transformation is described in appendix C.

Table 7 summarizes and compares the computational loads for each protocol.
The computational load for session key generation in Table 7 considers the
multiplications included in $r_M \cdot G$, $r_M \cdot P_B$, and $x_B \cdot T$. The
multiplications for generation/verification of the tag (see appendix B.2) in the
1.5-move protocol are included in ’etc.’.



Table 7. Computational load [bit multiplication].

                              The proposed    The PACS    The 1.5-move
M   Signature generation      -               [36M]       -
    Public key encryption     -               [72M]       -
    Session key generation    [72M]           -           [72M]
    etc.                      -               -           [36M]
    Subtotal                  [72M]           [108M]      [108M]
B   Public key decryption     -               36M         -
    Session key generation    36M             -           36M
    Signature verification    72M             72M         -
    etc.                      -               -           36M
    Subtotal                  108M            108M        72M

[·] : off-line computational load



The proposed protocol has the least computational load for the mobile user.
Although reducing a computational load that can be handled off-line does not
decrease latency, it does reduce the power consumption of the mobile
user device.

Considering only the computational load for the base station, the 1.5-move
protocol has the lowest. However, the 1.5-move protocol does not consider
non-repudiation of the user at all. If a signature scheme is applied to the
1.5-move protocol, the computational loads become [144M] for the mobile
user and 144M for the base station.

5 Conclusion 

A new protocol for authentication and key establishment was presented: a 2-pass
public key based scheme designed specifically for mobile communications. From
the viewpoint of security, it improves on the PACS and the 1.5-move scheme by
solving their weaknesses. It also achieves high bandwidth efficiency and a lower
computational load at the mobile user device.

A Signcryption Scheme

Zheng and Imai [13] specify signcryption schemes on elliptic curves. A
signcryption scheme is a cryptographic method that fulfills both the functions
of secure encryption and digital signature in a single logical step, but with a
computational load smaller than that required by traditional
signature-then-encryption. Figure 2 illustrates the signcryption scheme on an
elliptic curve. We assume the same notation as given in Table 1.

Signcryption of m by M the sender:

$(k_1 \| k_2) = hash(r_M \cdot P_B)$
$c = E_{k_1}(m)$
$r = KH_{k_2}(m, \text{blind-info})$
$s = r_M / (r + x_M) \bmod q$

⇒ c, r, s ⇒

Unsigncryption of (c, r, s) by B the recipient:

$u = s \cdot x_B \bmod q$
$(k_1 \| k_2) = hash(u \cdot P_M + (u \cdot r) \cdot G)$
$m = D_{k_1}(c)$
$KH_{k_2}(m, \text{blind-info}) \stackrel{?}{=} r$

Fig. 2. Implementations of signcryption on elliptic curve.

where KH means a keyed one-way hash function. The blind-info in the
computation of r may contain the public keys or public key certificates of both
M and B.



B Flows of the PACS and the 1.5-Move Protocols 

The PACS and the 1.5-move protocol are both described using the notation in 
Table 1. 



B.1 PACS Protocol

The PACS protocol [1] is one of the North American PCS standards. The
flows of the protocol can be depicted as follows.

1. B → M : $Cert_B \| RT$

2. M → B : $Enc_{P_B}\{K \| ESN \| TID_M \| RT\} \| E_K\{Cert_M\}$

where $Enc_{P_B}$ denotes a public key encryption, and $TID_M$ and $ESN$ denote a
temporary identity of the user and a serial number of the user device,
respectively. The session key is given by
$K = Sig_M\{RT \| ID_B \| TID_M \| ESN\}$.



B.2 1.5-Move Protocol 

Zheng’s 1.5-move protocol [2] was designed for authentication and security for 
mobile computing. The flows of the protocol can be described as follows. 

1. $B \to M$ : $Cert_B \,\|\, TS$

2. M ^ B : Cl {= mod p)||c 2 (= mod p) 0 {K\\T S\\C ert M\\tag)) , 

where tag = hash{K\\TS\\CertM\\{PB’^^^’^ modp)) and G(-) is a cryptograph- 
ically strong pseudo-random number generator. 
C Computational Load in Bit Multiplications

C.1 Single Exponentiation over a Finite Field

Consider a single exponentiation $M^d \bmod n$, where n, M, and d are 160-bit
numbers. On average the exponentiation requires 160 squarings and 80
multiplications. Accordingly, the number of bit multiplications needed to
compute a single exponentiation is as follows.

number of bit multiplications = (1 + 0.5) × 160 × (160 × 160)

≈ 6M



C.2 Elliptic Curve DSA Scheme 

To generate a signature, k · P is computed, where P is a generator point of an
EC system and k is a randomly chosen integer. P and k are each 160 bits long.
The multiplication requires 160 doublings and 80 point additions. A single
addition on the EC system requires three multiplications and one inverse over a
finite field whose order is a 160-bit number. The inverse has a time load
roughly equivalent to three multiplications over the finite field. Thus, a
single point operation on the EC system is equivalent to 6 multiplications over
the finite field. In addition, two more multiplications and one inverse over
the finite field are finally needed to obtain the signature parameter. Each of
these multiplications requires 160 × 160 ≈ 25K bit multiplications, which is
negligible compared to the main computational load. The total number of bit
multiplications needed to generate a signature is as follows.

number of bit multiplications = (160 + 80) × 6 × (160 × 160)

≈ 36M

To verify the signature, two multiplications of the form k · P are required.
Thus, the number of bit multiplications for a single signature verification
is double that of k · P, i.e. 72M.

C.3 ElGamal Encryption in EC System [7] 

To encrypt a message m using ElGamal encryption in an EC system, the sender
computes k · G and masks m with the point P = k · (b · G), where k is a
randomly chosen number and G is a generator. Encryption therefore costs two
scalar multiplications, so the number of bit multiplications needed to encrypt
a message is as follows.

number of bit multiplications = (160 + 80) × 6 × (160 × 160) × 2

≈ 72M

Since decryption needs a single scalar multiplication, it costs 36M bit
multiplications.
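
The arithmetic in this appendix can be rechecked with a few lines of Python. This is only a sketch of the cost model stated above (160-bit operands, 6 field multiplications per point operation); the function names are illustrative and do not come from the paper.

```python
# Rough cost model from Appendix C; all operands are 160 bits long.
BITS = 160


def modexp_cost(bits=BITS):
    """Square-and-multiply: `bits` squarings plus bits/2 multiplications."""
    field_mults = bits + bits // 2
    return field_mults * bits * bits


def ec_scalar_mult_cost(bits=BITS, field_mults_per_point_op=6):
    """`bits` doublings plus bits/2 additions, each costed at 6 field mults."""
    point_ops = bits + bits // 2
    return point_ops * field_mults_per_point_op * bits * bits


costs = {
    "modular exponentiation": modexp_cost(),
    "ECDSA signature generation": ec_scalar_mult_cost(),
    "ECDSA signature verification": 2 * ec_scalar_mult_cost(),
    "EC ElGamal encryption": 2 * ec_scalar_mult_cost(),
    "EC ElGamal decryption": ec_scalar_mult_cost(),
}

for name, value in costs.items():
    print(f"{name}: {value:,} bit multiplications")
```

The exact products are 6,144,000 and 36,864,000, which the text abbreviates as 6M and 36M; the 72M figures are twice the abbreviated 36M.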
Abstract. The paper considers verifiable Shamir secret sharing and
presents three schemes. The first scheme allows recovered secrets to be
validated. The second construction adds cheater identification, also
called the share validation capability. The third scheme permits multiple
secrets to be shared with secret validation. The constructions are based
on hashing; for security evaluation, hashing is modelled as a random
oracle with a public description. We discuss an application of verifiable
secret sharing to the design of cryptographic time capsules for
time-release crypto.



1 Introduction 

Tompa and Woll [14] demonstrated how a dishonest participant can recover
the secret in a Shamir threshold scheme while leaving the other active
participants with an invalid secret. They also suggested a prevention method in
which both the share and the coordinate compose the secret share of a
participant. Since then, much research effort has been devoted to the broad
area of secret validation. The results achieved so far relate to secret
validation in both unconditionally and conditionally secure secret sharing.

In unconditionally secure secret sharing, there is no share verification to check 
whether or not participants have received their correct shares. Instead, shares 
are verified by the combiner who refuses to accept shares which have not passed 
verification process. If a share submitted by a participant fails verification, then 
the participant is identified as a cheater. Note that even an honest participant 
can be labelled a cheater if the share has been corrupted during transmission 
from the dealer (or if the assumption about honesty of the dealer does not hold) . 
Rabin and Ben-Or [8] gave a solution which allows the combiner to verify shares 
provided by participants by checking whether they satisfy a system of linear 
equations. Carpentieri in [3] improved the above scheme by showing that the 
verification can be done with shorter shares. 
In conditionally secure secret sharing, two distinct secret verification
problems have been addressed: noninteractive share verification and secret
verification. The first problem relates to a misbehaving dealer or an unreliable
communication channel used for share transmission from the dealer to a
participant. Feldman [4] gave a solution in which the dealer, for the polynomial
$f(x) = a_0 + a_1 x + \ldots + a_{t-1} x^{t-1}$, broadcasts the values $g^{a_i}$
for $i = 0, \ldots, t-1$, where $g$ is a primitive element of a cyclic group
from GF(q) (q is large enough that the discrete logarithm problem becomes
intractable). Pedersen [7] used a commitment scheme to obtain verifiable secret
sharing. Note that share verification can be applied in two cases: (1) when a
participant obtains their share from the dealer and (2) when a participant
submits their share to the combiner.

There is also a class of publicly verifiable secret sharing (PVSS) schemes
introduced by Stadler [12] (see also [10]). This class is not very interesting
from our point of view as the verification of the recovered secret can only be
done by the participants (unless the secret becomes public!). Besides, the
proposed PVSS schemes are very expensive to set up and to run.

The work is structured as follows. Limitations of verifiable secret sharing
are discussed in Section 2. The motivation and the necessary background are
introduced in Sections 3 and 4, respectively. Secret sharing with validation of
secrets is studied in Section 5. Section 6 shows how secret sharing with secret
validation can be upgraded to identify cheaters. Multisecret sharing with secret
validation is presented in Section 7. Applications of verifiable secret sharing
to the design of time capsules are considered in Section 8.



2 Limitations of Proposed Verifiable Secret Sharing 
Schemes 

Unconditionally secure verification of shares suffers from the following draw- 
backs. 

— Information needed for share verification is typically very long. In the
Rabin-Ben-Or scheme, each participant holds $(3n - 2)$ additional shares. The
Carpentieri scheme assigns $(2n + t - 1)$ additional shares, where n is the
size of the group and t is the threshold.

— The burden of storing information needed for verification is shifted to par- 
ticipants. 

— The secret recovery is performed collectively by all active participants (the 
combiner is distributed). 

— Delegation of share verification and key recovery to a trusted combiner in- 
volves transfer of long verification information. 

In conclusion, unconditionally secure verification is a valid option if the param- 
eters t and n are relatively small. Note that if a threshold scheme is designed for 
a very large group (t > 1000), then this option is not very attractive. 
Conditionally secure verification based on the Pedersen scheme [7] offers some 
benefits: 
— each participant holds a share twice as long as the original share,

— verification information is public so the verification can be done without 
interaction with other participants. 

— verification can be run in two different cases: by participants to check the 
validity of shares obtained from the dealer and, by the combiner to identify 
cheating, 

— the scheme is perfect, i.e. any (t — 1) collaborating participants learn nothing 
about the secret. 

Drawbacks of the Pedersen scheme include the following: 

— to enable each participant to identify cheaters, the combiner must release 
original shares provided by participants (via secure channels) or alterna- 
tively, zero-knowledge proofs can be applied to check whether shares have 
been correct (see [10]). Both solutions are expensive. 

— the length of a single verifying information (exponents) must be as long as 
the modulus q used for computations, 

— the verifying information is long and must be either stored centrally (to save 
on storage) or by each participant (to avoid authenticated communication 
from the central trusted registry at the time of verification). 

3 Motivation 

Verifiable secret sharing addresses three different problems: 

1. verification of secret recovered by the combiner, 

2. verification of shares at the time of setting up the scheme [4,7], 

3. cheater identification - share verification at the secret reconstruction stage 
[3,8]. 

Note that problems (2) and (3) are identical. The only difference is who is per- 
forming the verification. Note that verification of shares obtained from the dealer 
can be necessary in the two cases: (a) when the dealer misbehaves and (b) when 
the communication channels from dealer to participant are unreliable. 

Our goal is to address the following two problems: 

Problem 1: verification of secret recovered by the combiner. 

Problem 2: cheater identification (share verification). 

To make implementation efficient and fast, we use a collision-free hash function 
with public description. The use of hashing has the following advantages: 

— selection of probabilistic parameters necessary in secret sharing can be done 
with the assistance of hash functions, 

— security properties can be proved using the random oracle model, 

— the length of shares does not need to be larger than the hashing block size
(n.b. a typical hashing block size is 10 times shorter than the block size in
the Pedersen scheme),

— computation of verification information using hashing is much faster than
equivalent computations using exponentiation.
4 Building Blocks

Consider a group of participants $V = \{P_1, \ldots, P_n\}$. Assume that the
group wishes to collectively own a secret s in such a way that only a big
enough subset of V can recover the secret. The definition of a big enough
subset can be given by the enumeration of all smallest subsets which are still
able to recover the secret. This collection together with all possible
supersets is called the access structure $\Gamma$. Threshold (t, n) schemes
have a very simple access structure. It consists of all subgroups whose
cardinality is equal to or larger than t, or

$\Gamma = \{A \mid \#A \ge t\}$,

where $\#A$ stands for the cardinality of $A \subseteq V$.

Secret sharing is perfect if any subgroup A which does not belong to the
access structure can learn nothing about the secret. More precisely, the entropy
of the secret is undiminished for any subset $A \notin \Gamma$, or simply

$H(S) = H(S \mid A)$.

Clearly, $H(S \mid A) = 0$ for any $A \in \Gamma$. We define a weaker notion of
perfectness which is useful in the conditionally secure setting.

Definition 1. A secret sharing is computationally perfect if finding the secret
involves the exhaustive search of the whole space from which the secret has been
chosen (so if the space contains $2^k$ elements, an average search takes
$2^{k-1}$ steps).

Secret sharing is a pair of two algorithms: the dealer $\mathcal{D}$ and the
combiner $\mathcal{C}$. The dealer sets up the scheme for a given or randomly
chosen secret with
requested security parameter defined by the size of shares. The combiner is 
activated at the reconstruction stage. Collaborating participants submit their 
shares to the combiner who recovers the secret if the currently active set of 
participants belongs to the access structure. Otherwise, the combiner fails. 

The Shamir (t, n) secret sharing is a collection of the dealer and combiner
[11]. Computations are done in GF(q), and q specifies the size of the set from
which shares and secrets are drawn. The dealer takes a secret $s \in GF(q)$ and
the threshold parameter t. For them, the dealer chooses a polynomial

$f(x) = s + a_1 x + \ldots + a_{t-1} x^{t-1}$

for random elements $a_i \in GF(q)$, $i = 1, \ldots, t-1$. The shares are
$s_i = f(x_i)$ for $i = 1, \ldots, n$. The dealer sends $s_i$ to $P_i$ via a
secure channel, while the $x_i$ together with their assignment to participants
are made public.

At the reconstruction stage, if t or more participants submit their shares to
the combiner, then it applies the Lagrange interpolation to recover the
polynomial f(x) and the secret s = f(0). If the number of participants is
smaller than t, then the interpolation will produce a different polynomial
$f'(x) \ne f(x)$ and the combiner will recover an invalid secret with an
overwhelming probability.
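
The dealer and combiner just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the prime modulus `Q` and the function names `make_shares` and `recover_secret` are assumptions made for the example.

```python
import secrets

Q = 2**127 - 1  # a Mersenne prime, chosen here only for illustration


def make_shares(secret, t, xs, q=Q):
    """Dealer: pick f(x) = secret + a_1 x + ... + a_{t-1} x^{t-1} at random."""
    coeffs = [secret] + [secrets.randbelow(q) for _ in range(t - 1)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % q
        return acc

    return [(x, f(x)) for x in xs]


def recover_secret(shares, q=Q):
    """Combiner: Lagrange interpolation evaluated at x = 0."""
    secret = 0
    for i, (xi, si) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % q
                den = den * (xi - xj) % q
        secret = (secret + si * num * pow(den, -1, q)) % q
    return secret


shares = make_shares(1234, t=3, xs=[1, 2, 3, 4, 5])
print(recover_secret(shares[:3]))   # any 3 of the 5 shares give back 1234
print(recover_secret(shares[2:5]))
```

The modular inverse via `pow(den, -1, q)` needs Python 3.8 or later. Passing fewer than t shares to `recover_secret` still returns a value, but it is the constant term of a different polynomial, as the text notes.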




Verifiable Secret Sharing and Time Capsules 173 



For security evaluation, we need a mathematical model of hashing. We use
the random oracle model [1]. The hash function in this model is a function

$H : \Sigma^* \to \Sigma^k$

which takes an input message of arbitrary length $m \in \Sigma^*$ and assigns a
k-bit output. For a given message m, the k-bit output is selected randomly,
independently and uniformly from the set $\Sigma^k$ of all k-bit values. The
hash function H is publicly accessible.

Lemma 1. Given a hash function in the random oracle model, a subset
$S \subset \Sigma^*$ and a randomly chosen element $s \in S$, where $\#S = q$
and $q \ll 2^k$. Then the knowledge of $u = H(s)$ allows s to be identified in
S after searching through q/2 possible values of S, on the average.

Instead of a formal proof, let us notice that this lemma represents a typical
problem of searching through a sequence of unsorted entries with random
numbers. The assumption that $q \ll 2^k$ “guarantees” that the value u is
unique in the set S (with a very high probability). In this case, it is known
that to identify a single and unique entry amongst a sequence of q unsorted
entries takes q/2 searches, on the average.

Hashing in the random oracle model is also subject to the generic birthday 
attack which follows the well-known birthday paradox. For details see [6] . 
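
The search argument behind Lemma 1 can be illustrated with a small experiment. The sketch below is only an illustration under stated assumptions: SHA-256 stands in for the random oracle, and the set S is an arbitrary list of 200 four-byte strings.

```python
import hashlib


def H(message: bytes) -> bytes:
    # SHA-256 stands in for the random oracle (an assumption for illustration).
    return hashlib.sha256(message).digest()


def search(candidates, u):
    """Return (element, hash evaluations used) for the first s with H(s) == u."""
    for steps, s in enumerate(candidates, start=1):
        if H(s) == u:
            return s, steps
    return None, len(candidates)


S = [i.to_bytes(4, "big") for i in range(200)]  # q = 200 candidates

# Average the search cost over every possible choice of the hidden element s.
average_steps = sum(search(S, H(s))[1] for s in S) / len(S)
print(average_steps)
```

Averaged over all q positions, a linear scan needs (q + 1)/2 hash evaluations, which is the q/2 of the lemma up to rounding.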

5 Secret Sharing with the Validation of Secrets Recovered 

Consider a (t, n) Shamir secret sharing based on a polynomial
$f(x) = s + a_1 x + \ldots + a_{t-1} x^{t-1}$ in GF(q). There is also a
publicly accessible hashing function $H : \Sigma^* \to \Sigma^k$ which for an
arbitrary message $m \in \Sigma^*$ returns a k-bit hash value H(m).

The Dealer 

Given a collection of participants $V = \{P_1, \ldots, P_n\}$ and the threshold
parameter t, the dealer

1. chooses the secret $s \in_R GF(q)$ at random and computes the coefficients
of the polynomial f(x) according to

$a_i = H(s, i, V, t, T)$ for $i = 1, \ldots, t-1$,

where s is the secret to be shared by the group V, i is an integer indicating
the coefficient index, V specifies the membership of the group, t is the
threshold parameter and T is a timestamp. We assume that no two secret sharing
schemes have the same timestamp,

2. finds the shares

$s_i = f(x_i) \pmod q$,

3. distributes the shares via confidential channels to $P_i$,
$i = 1, \ldots, n$. The values $x_i$ are public.
The Combiner

At the reconstruction stage, the combiner collects shares from participants. If
the number of shares is equal to t, the combiner reconstructs the polynomial
f(x) using the Lagrange interpolation and computes s. To verify the recovered
secret, the combiner re-computes the coefficients $a_i$ from the public hash
algorithm, i.e.

$a_i = H(s, i, V, t, T)$.

If the checks hold for all $i = 1, \ldots, t-1$, the secret is accepted and
sent (via confidential channels) to all active participants. They can repeat
the verification process.
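
The dealer and combiner of this section can be sketched as follows. This is a hedged illustration rather than the authors' code: SHA-256 reduced modulo an illustrative prime `Q` stands in for H, and the names `deal` and `combine` are made up for the example. Instead of comparing coefficients one by one, the combiner below re-derives the coefficients from the recovered s and checks that the resulting polynomial reproduces the submitted shares; since a polynomial of degree below t is fixed by t points, this is equivalent to the coefficient check.

```python
import hashlib
import secrets

Q = 2**127 - 1  # prime modulus, picked only for this illustration


def H(*fields):
    """Hash the fields to an element of GF(Q); SHA-256 stands in for H."""
    data = "|".join(str(f) for f in fields).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def polynomial(s, t, group, T):
    """Coefficients [s, a_1, ..., a_{t-1}] with a_i = H(s, i, V, t, T)."""
    return [s] + [H(s, i, group, t, T) for i in range(1, t)]


def evaluate(coeffs, x):
    acc = 0
    for c in reversed(coeffs):  # Horner's rule
        acc = (acc * x + c) % Q
    return acc


def deal(t, group, xs, T):
    s = secrets.randbelow(Q)
    coeffs = polynomial(s, t, group, T)
    return s, [(x, evaluate(coeffs, x)) for x in xs]


def combine(shares, t, group, T):
    """Recover s = f(0) by Lagrange interpolation, then validate it."""
    s = 0
    for i, (xi, si) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % Q
                den = den * (xi - xj) % Q
        s = (s + si * num * pow(den, -1, Q)) % Q
    # Validation: the hash-derived polynomial must pass through every share.
    coeffs = polynomial(s, t, group, T)
    valid = all(evaluate(coeffs, x) == y for x, y in shares)
    return s, valid


group, T = ("P1", "P2", "P3", "P4"), "T0"
s, shares = deal(3, group, [1, 2, 3, 4], T)
print(combine(shares[:3], 3, group, T) == (s, True))
```

A single modified share changes the interpolated secret, so the re-derived coefficients no longer fit and the validation fails, but the sketch cannot tell which participant cheated.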

Theorem 1. The secret sharing with validation of secret is computationally per- 
fect in the random oracle model. 

Proof. In the assumed model of hashing, all random variables assigned to
messages are independent as long as the messages are different. Note that the
coefficients $a_i$ are, therefore, independent random variables, as the
messages $m = (s, i, V, t, T)$ differ from each other in the coordinate index
i. Further, suppose that there are $(t-1)$ participants, say
$P_1, \ldots, P_{t-1}$, who want to recover the secret and pool their shares
together. They can assemble the following system of linear equations:

$$\begin{pmatrix} s_1 - s \\ \vdots \\ s_{t-1} - s \end{pmatrix} =
\begin{pmatrix} x_1 & \cdots & x_1^{t-1} \\ \vdots & & \vdots \\
x_{t-1} & \cdots & x_{t-1}^{t-1} \end{pmatrix}
\begin{pmatrix} a_1 \\ \vdots \\ a_{t-1} \end{pmatrix}$$

and express the coefficients $a_i$ as

$$\begin{pmatrix} a_1 \\ \vdots \\ a_{t-1} \end{pmatrix} =
\begin{pmatrix} x_1 & \cdots & x_1^{t-1} \\ \vdots & & \vdots \\
x_{t-1} & \cdots & x_{t-1}^{t-1} \end{pmatrix}^{-1}
\begin{pmatrix} s_1 - s \\ \vdots \\ s_{t-1} - s \end{pmatrix}.$$

The above equation has as many solutions as there are different $s \in GF(q)$.
As the secret is selected randomly and uniformly from GF(q), it means that
all values are equally probable. The scheme, however, allows them to check
whether a currently selected secret is correct by trying to re-generate the
coefficients of f(x).

Now we can create a subset $S \subset \Sigma^*$ such that each element is of
the form $(\alpha, i, V, t, T)$, where $\alpha \in GF(q)$. Next we are going to
identify $(t-1)$ messages $\beta_1, \ldots, \beta_{t-1} \in S$ such that
$H(\beta_i) = a_i$. Using Lemma 1 we can conclude that it will take q/2 steps
on the average to find a unique sequence of matching $\beta_i$.

The scheme allows each active participant to confirm that the recovered 
secret is valid. If, however, the recovered secret is invalid then the scheme does 
not provide any facilities to identify cheaters. Note that a single cheater can 
recover the valid secret (and verify its validity) if he applies the method described 
by Tompa and Woll in [14]. Note further that honest participants are able to 
discover the fact that they have been cheated. 
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6 Secret Sharing with Cheater Identification 

Verifiable secret sharing schemes allow for cheater detection or identification 
either at the same time when shares are pooled or later. This is certainly the 
case in unconditionally secure secret sharing when participants “simultaneously” 
reveal their shares permitting the active participants to verify their correctness. 
Consider the conditionally secure secret sharing. Feldman [4] makes the values
$g^{a_i}$ public, which allows cheaters to be identified after they expose
their shares, as everybody can check whether

$g^{f(x_i)} \stackrel{?}{=} g^{s_i}$,

where $s_i$ is the share provided by $P_i$ and $g^{f(x_i)}$ is computed from
the public values $g^{a_j}$. Similar comments can be made about the Pedersen
scheme [7].
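
A Feldman-style share check of this kind can be sketched with toy numbers. Everything concrete below is an assumption for illustration: the tiny parameters (P = 23, subgroup order Q = 11, generator G = 2) offer no security, and the function names are hypothetical.

```python
P, Q, G = 23, 11, 2  # toy parameters: G has prime order Q modulo P


def commitments(coeffs):
    """Public values g^{a_j} mod P for the polynomial coefficients a_j."""
    return [pow(G, a, P) for a in coeffs]


def share(coeffs, x):
    """s = f(x) mod Q for f with the given coefficients (constant term first)."""
    return sum(a * pow(x, j, Q) for j, a in enumerate(coeffs)) % Q


def verify(x, s, public):
    """Check g^s == prod_j (g^{a_j})^{x^j} mod P without knowing the a_j."""
    expected = 1
    for j, c in enumerate(public):
        expected = expected * pow(c, pow(x, j, Q), P) % P
    return pow(G, s, P) == expected


coeffs = [7, 3, 5]               # secret 7, threshold t = 3
public = commitments(coeffs)
s2 = share(coeffs, 2)
print(verify(2, s2, public))             # an honest share passes
print(verify(2, (s2 + 1) % Q, public))   # a modified share fails
```

The check needs the share itself as input, which is the point made next in the text: to identify a cheater this way, the verifier must see the shares.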

In other words to identify cheaters, the combiner must know their shares. 
This has some very profound implications on the security of secret sharing. The 
combiner, after validation of shares, may not be able to compute the secret. 
This happens when t participants are active but at least one of them cheats. 
The combiner may either 

— wait until the number of honest participants reaches t or 

— forget the shares and abort the recovery of secret. 

The first solution makes sense if there are many other participants who are ready 
to pool their shares on request from the combiner. An enemy who would like 
to access the secret controlled by secret sharing, certainly sees the combiner as 
an attractive target (instead of attacking t different participants, it is enough to 
break security of the combiner) . The longer the combiner “lives” the higher risk 
of a successful attack. If the waiting time is too long, then active participants 
may decide to terminate the combiner. This obviously increases complexity of 
secret sharing as there must be an additional abort operation. This operation is 
used whenever the combiner detected cheating and the number of valid shares 
is not enough to reveal the secret. 

So we have arrived at the second option, which aborts the secret recovery by
terminating the combiner. This could be done “safely” if the combiner is
trusted in the sense that it does not disclose any valid shares to
unauthorised persons and is memoryless, so that after its termination all
secret elements are deleted.

Below we present a scheme which allows the combiner to: 

— identify cheaters without revealing shares by active participants, 

— abort key recovery before participants pool their shares to recover the secret. 
This occurs only when the number of honest participants is smaller than the 
threshold parameter t. 

— repeat key recovery after abortion (this may be done by the same or a dif- 
ferent combiner). 

The scheme holds valid shares for the smallest possible time interval allowing the 
combiner to recover the secret only if there are t active and honest participants. 
The Dealer

Given a collection of participants $V = \{P_1, \ldots, P_n\}$, the threshold
parameter t, and the number of permitted abortions $\ell$, the dealer

— selects the coefficients of the polynomial f(x) according to

$a_i = H(s, i, V, t, T)$,

creates the polynomial $f(x) = s + a_1 x + \ldots + a_{t-1} x^{t-1}$ and finds
the shares $s_i = f(x_i)$,

— assumes that $c_{i,0} = (x_i, s_i)$ is the concatenation of $x_i$ and $s_i$,
and computes the check values

$c_{i,j} = H(c_{i,j-1}, P_i, j-1, V, t, T)$

for $i = 1, \ldots, n$ and $j = 1, \ldots, \ell$,

— computes the polynomial

$G(x) = \prod_{i=1}^{n} (x - c_{i,\ell})$,

— distributes shares and verification information via secure channels, i.e.
$P_i$ gets $s_i$ and $\{c_{i,j} \mid j = 0, \ldots, \ell-1\}$. This information
is known to $P_i$ only; $i = 1, \ldots, n$,

— publishes the polynomial G(x).

Share Validation 

Assume that at the pooling time, there are v active participants, say
$P_1, \ldots, P_v$. Each active participant $P_i$

— announces $c_{i,\ell-1}$ by displaying the data on a public billboard
(broadcasting) in the form of a signed message,

— if all check values are displayed on the billboard, computes the hash values
$\{c_{i,\ell} = H(c_{i,\ell-1}, P_i, \ell-1, V, t, T) \mid i = 1, \ldots, v\}$
and checks whether $G(c_{i,\ell}) = 0$ for all active participants. If there
are t participants out of the v active ones ($t \le v$) whose check values have
passed verification, then they call the combiner. Otherwise, the attempt is
aborted.

Clearly, any failed attempt at secret recovery causes participants to reveal
part of their check values; in other words, they have disclosed pre-images of
the hash values.
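
The check-value chain and the first validation attempt can be sketched as below. This is an illustration under assumptions, not the paper's implementation: SHA-256 modulo an illustrative prime `Q` plays the role of H, the shares are placeholder numbers rather than real Shamir shares, and G(x) is evaluated from its roots instead of from expanded coefficients.

```python
import hashlib

Q = 2**127 - 1  # prime modulus, picked only for this illustration


def H(*fields):
    data = "|".join(str(f) for f in fields).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def check_values(x_i, s_i, name, ell, group, t, T):
    """Return [c_{i,0}, ..., c_{i,ell}] with c_{i,0} = (x_i, s_i)."""
    chain = [(x_i, s_i)]
    for j in range(1, ell + 1):
        chain.append(H(chain[-1], name, j - 1, group, t, T))
    return chain


def G(x, roots):
    """Evaluate G(x) = prod_i (x - c_{i,ell}) mod Q."""
    value = 1
    for c in roots:
        value = value * (x - c) % Q
    return value


def passes_validation(announced, name, ell, roots, group, t, T):
    """First attempt: `announced` is c_{i,ell-1}; hash once and test G."""
    c_last = H(announced, name, ell - 1, group, t, T)
    return G(c_last, roots) == 0


group, t, T, ell = ("P1", "P2", "P3"), 2, "T0", 3
shares = {"P1": (1, 111), "P2": (2, 222), "P3": (3, 333)}  # placeholder shares
chains = {n: check_values(x, s, n, ell, group, t, T)
          for n, (x, s) in shares.items()}
roots = [chain[-1] for chain in chains.values()]  # zeros of the public G(x)

print(passes_validation(chains["P1"][ell - 1], "P1", ell, roots, group, t, T))
print(passes_validation(424242, "P1", ell, roots, group, t, T))
```

After an aborted attempt the announced value is used up; in a second attempt a participant would announce the preceding chain element and the verifiers would hash twice before testing G.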

It is a security policy matter to decide what needs to be done when there 
is a group of dishonest participants. If the number of cheaters is small then the 
recovery of secret can go ahead if the number of honest participants is at least 
t. In general, one would expect that if a majority of participants has passed the 
share validation and it contains at least t members, then they will go for recovery 
of the secret. 

The Combiner 

The combiner is activated only if there is a group of t or more honest participants. 
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— The combiner asks the active and honest participants to submit their shares. 

— Participants submit their shares. The combiner repeats computations of hash 
values and verifies whether the last check values are zeros of G{x). This part 
can be substantially shortened if the combiner can access transcript of the 
public discussion carried out during share validation. 

— Shares provided by cheaters are disregarded, and if there are at least t
shares, the combiner recovers the secret using the Lagrange polynomial
interpolation.

— The combiner validates the recovered secret (see the secret sharing scheme
from the previous section).

— The combiner distributes the secret via secure channels to all active
participants, who may repeat the whole verification process.

Note that share validation can detect opponents who impersonate participants. 
A successful completion of the share validation does not guarantee that the 
active participants will not try to cheat at the recovery stage. It, however, re- 
assures that all active participants hold their valid shares so the combiner will 
be able to recover the secret (perhaps after a “gentle” warning for misbehaving 
participants) . 

Consider the following attacks on the system. 

— Recovery of check information and shares. This attack tries to reverse
the hash function. To be successful the attacker must perform, on the
average, $\approx 2^{k-1}$ steps if the hash function outputs k-bit digests.
Clearly, after $\ell - 1$ abortions, the shares can be computed with the same
computational overhead. Note that the public polynomial G(x) gives out n
digests expected to appear when the valid check values are produced. In the
random oracle
model, an attacker is able to find a value a such that G{a) = 0 after « 
steps. The polynomial G{x) together with the hash function is a form of the 
well-known sibling intractable hash function. The security evaluation of such 
construction can be found in [16]. 

— Cheating by the participants. A participant $P_i$ can try to find a colliding
message $s_i'$ for his valid share $s_i$ such that

$H(s_i') = H(s_i)$.

This will cost $P_i$ on the average $\approx 2^{k/2}$ steps. Note that the
cheating will not be detected by anybody except the combiner (with probability
$1 - 2^{-k}$) at the validation of secret stage. If the combiner is implemented
as a distributed system and the cheater can see the shares of all active
participants, then he may apply (successfully) the Tompa-Woll attack. If,
however, the combiner is isolated from active participants and does not return
secrets which have not passed the validation stage, then the secret is lost,
assuming abortion is not an option.

— Collusion of $t - 1$ participants. This attack has been discussed in the
previous section. The scheme is computationally perfect: to find the secret it
is necessary to exhaustively search $2^k$ possible values.
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It can be argued that participants may have limited computing resources so 
multiple computation of hashing values can substantially slow down the secret 
recovery process. This is the case when participants are mobile agents who keep 
their shares on smart cards. It is possible to modify the secret sharing with 
cheater identification in such a way that check values are computed in “parallel” 
instead of in “sequence” . 

The dealer can be modified by calculating

$c_{i,j} = H(x_i, s_i, j, ID_{P_i}, V, t, T)$

for $j = 1, \ldots, \ell$. Further, the dealer computes n distinct polynomials,
one for every participant. So $P_i$ is assigned $G_i(x)$ such that

$G_i(x) = \prod_{j=1}^{\ell} (x - \alpha_{ij})$


where aij = IDp^,V Polynomials Gi{x) are public. 



7 Multisecret Sharing with Validation 

Consider $\ell$ secrets, say $(K_1, \ldots, K_\ell)$, which are supposed to be
shared among the same group V of participants. Participants may have limited
storage resources and need to hold as few shares as possible. So the following
problem arises: how to design a family of $\ell$ secret sharing schemes, each
of which allows a single secret to be recovered, assuming that any participant
holds a single share only. The solution is based on the work [13] and is
described for the case when $\ell = 2$ but can be easily extended to an
arbitrary $\ell$.
The Dealer 

Given a collection of participants $V = \{P_1, \ldots, P_n\}$ and two secrets
$(K_1, K_2)$ which are supposed to be shared by V using two threshold schemes
$(t_1, n)$ and $(t_2, n)$. Two cryptographically strong collision-free hash
functions $H_1$ and $H_2$ with public descriptions are given. The dealer

1. selects basic shares Sj for Pj at random from the large enough Galois field 
GF(q), 

2. assigns a public Xj for each Pj, 

3. for each basic share Sj, computes two shares 

= Hi{Sj,V, ti, T) and = H 2 {S,,V, h, T) 

4. applies the Lagrange approximation and finds the two polynomials 

fi{x) = Ki + a^i^x -I- ... -I- 

which passes through the points (0, Kj), (ccj, and additionally = 

Hi{Ki,V,ti,T); t = 1,2, 
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5. sends via private channels basic shares to participants and broadcasts the 
coefficients , a^^+i- 

The Combiner 

At the reconstruction stage, ti active participants pool their shares together. 
Knowing the co-efficients , a^^+i points, the combiner can recover 

the polynomial fi{x) and the secret 

Ki = /i(0) 

and validate the secret by checking 

Note that this time validation of the secret recovered is done by the combiner. 
To allow participants to validate the secret, the combiner must provide the 
polynomial fi{x) to Pj so Pj can check her share fi{xj) = and the value 

8 Time Capsules 

Rivest, Shamir and Wagner [9] consider time-lock puzzles which can be used to 
control the delay of execution of cryptographic operations (also see [5]). Appli- 
cations of time-release crypto for key escrowing can be found in [2,15]. There are 
two general approaches to the design of time capsules: 

— algorithmic - the delay is measured by the time-complexity of a well-under- 
stood numerical problem, 

— probabilistic - the delay is measured by the number of steps (typically in 
the exhaustive search) necessary to discover the secret. 

The algorithmic approach suffers from an obvious drawback: the time
complexity of the majority of “cryptographically” useful problems is not known.
This typically tends to “shorten” the expected delay as our knowledge about
algorithms and our computing technology progress. On the other hand,
probabilistic time capsules (timers) suffer from their inherent probabilistic
nature, which determines the delay in terms of the average rather than a
precise value. Additionally, probabilistic timing can easily be run in
parallel, shortening it proportionally to the number of timer copies executed
concurrently. Their strong point, however, is that the time measure does not
depend on progress in the theory of algorithms.

8.1 Time Capsules with a Single Timekeeper 

Rivest et al. based their time capsules on repeated squaring [9], which is
believed to be inherently sequential. Hashing is also inherently sequential,
and although it is much faster than squaring, the delay can be controlled by
the size of the message to be hashed. Assume that there is a collision-free
hash function H whose implementation provides v hashings per second. Suppose
further that the requested delay for the recovery of a secret s is $\tau$
seconds. The number of hashing operations necessary to introduce the delay is
$\tau v$. The delay can be implemented in different ways which are equivalent
in the sense that all must use hashing sequentially $\tau v$ times. Consider
the following two cases:

— $s = \underbrace{H(\ldots H}_{\tau v}(r) \ldots)$: this is sequential hashing
of the initial vector r,

— $s = H(\underbrace{r, \ldots, r}_{\tau v})$: the initial vector r is used to
create a sufficiently long message so that its hashing consumes $\tau v$
operations,

where s is the secret whose recovery must be delayed. The time capsule uses
secret sharing with the validation of the secret and is a collection of the
following algorithms:

— the dealer who sets up the capsule for the requested parameters. Those 
parameters include the number of agents who are authorised to switch on 
the capsule (to count down the delay), the threshold used to activate the 
capsule, and the requested delay, 

— the timekeeper who collects shares from agents and having her own shares 
is able to recover the initial vector r and after the delay r recover the secret 
s. 

Consider a simple case when the time capsule is controlled by a single agent Pi 
and a timekeeper P 2 - 

Dealer 

Given a collection of participants $\mathcal{P} = \{P_1, P_2\}$, the threshold parameter t = 2,
and the delay $\tau$, the dealer

1. chooses the initial vector $r \in_R GF(q)$ at random and “winds up” the timer
by finding the secret

$$s = H(r, \ldots, r),$$

2. computes $a_1 = H(s, 1, \mathcal{P}, t, \tau)$, and finds the polynomial $f(x) = r + a_1 x$
together with two shares $s_1 = f(x_1)$ and $s_2 = f(x_2)$,

3. distributes the shares to the participants so that $P_i$ knows her secret share $s_i$ while
$x_i$ is public (i = 1, 2).

Timekeeper 

The time capsule is activated by the agent $P_1$, who gives her share to the
timekeeper $P_2$. Knowing the two shares, $P_2$ can recover the polynomial and the initial
vector $r = f(0)$ and start computing the secret, which will be recovered when

$a_1 = H(s, 1, \mathcal{P}, t, \tau)$, where s is the current hash value after $i \le \tau v$ hashing
operations. This condition can be checked concurrently, so it will not influence the
delay $\tau$.
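The single-agent capsule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 reduced modulo a prime `Q` stands in for the hash function H and the field GF(q), the chained variant $s = H(\ldots H(r) \ldots)$ is used for winding, and the names `wind_up`, `deal` and `timekeeper` are hypothetical helpers introduced here, not part of the original construction.

```python
import hashlib

Q = 2**127 - 1  # a Mersenne prime standing in for the field GF(q) (assumption)

def H(*parts):
    """Hash arbitrary values to an element of GF(Q); SHA-256 is an assumption."""
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def wind_up(r, steps):
    """Sequential hashing s = H(...H(r)...), applied `steps` times."""
    s = r
    for _ in range(steps):
        s = H(s)
    return s

def deal(r, steps, params, x1=1, x2=2):
    """Dealer: wind up the timer, then split r with f(x) = r + a1*x."""
    s = wind_up(r, steps)
    a1 = H(s, *params)                    # validation value bound to the secret
    f = lambda x: (r + a1 * x) % Q
    return s, (x1, f(x1)), (x2, f(x2))

def timekeeper(share1, share2, params, max_steps):
    """Recover f from two shares, then hash until the check value a1 matches."""
    (x1, y1), (x2, y2) = share1, share2
    a1 = (y2 - y1) * pow(x2 - x1, -1, Q) % Q   # slope of the line
    r = (y1 - a1 * x1) % Q                      # r = f(0)
    s = r
    for i in range(1, max_steps + 1):
        s = H(s)
        if H(s, *params) == a1:                 # the secret validates itself
            return s, i
    raise ValueError("secret not found within max_steps")
```

The timekeeper does not need to know the number of steps in advance: the value $a_1$ recovered from the shares tells her when the current hash value is the secret.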
Note that winding up the timer to the requested delay takes the same amount
of time as the delay itself. This is clearly a problem, especially when delays are
long (days, weeks, months). A simple solution to this problem is the use of
$\ell$ concurrent winding threads. The first thread starts hashing from the initial
vector $r_1 = r$ while the others start from random vectors $r_i \in_R GF(q)$ ($i = 2, \ldots, \ell$).
After exactly $\tau v$ iterations, the outputs are collected from the threads;
let them be $h_1, \ldots, h_\ell$. Now the parallel threads are “glued” together to create
a single hashing stream which will delay the timekeeper by $\ell\tau$. To do this, the
dealer computes

$$g_i = h_i \oplus r_{i+1}$$

for $i = 1, \ldots, \ell - 1$ and makes them public. Note that the secret is $s = h_\ell$. To find
out the secret s, the timekeeper must start from $r_1 = r$ and hash
the prescribed number of times $\tau v$. Then she collects the output $h_1$, recovers
$r_2 = g_1 \oplus h_1$ and hashes again $\tau v$ times, getting $h_2$ and recovering $r_3 = g_2 \oplus h_2$,
etc. Finally, she gets $h_\ell = s$.
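The gluing of concurrently wound threads can be illustrated as follows. The sketch assumes SHA-256 as the hash function and 32-byte random starting vectors; `dealer_wind` and `timekeeper_unwind` are hypothetical names, and the dealer's threads are simply computed one after another here, although they are independent and could run in parallel.

```python
import hashlib
import secrets

def H(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def chain(start: bytes, steps: int) -> bytes:
    """Apply H sequentially `steps` times."""
    out = start
    for _ in range(steps):
        out = H(out)
    return out

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def dealer_wind(r: bytes, steps: int, threads: int):
    """Wind `threads` independent chains and glue them with XOR values."""
    starts = [r] + [secrets.token_bytes(32) for _ in range(threads - 1)]
    ends = [chain(s, steps) for s in starts]          # independent chains
    glue = [xor(ends[i], starts[i + 1]) for i in range(threads - 1)]
    return ends[-1], glue      # secret s = last output, glue values are public

def timekeeper_unwind(r: bytes, steps: int, glue):
    """Walk the glued chains one after another; this cannot be parallelised."""
    h = chain(r, steps)
    for g in glue:
        h = chain(xor(g, h), steps)   # recover the next starting vector
    return h
```

The dealer spends the time of one chain (given enough processors), while the timekeeper must traverse all of them in order.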

The major criterion for security evaluation seems to be the stability of the 
delay introduced by the capsule. There are two kinds of factors which influence 
the intended delay: 

1 . progress in technology - a faster (hardware, software) implementation of 
cryptographic primitives used in the time capsules, 

2 . advancement in cryptanalysis of cryptographic primitives used in the time 
capsules. 

The first factor always shortens the delay. This may force the designers to create
time capsules with a grossly overestimated delay, so that when the timer is activated
the delay is far longer than requested. This can be solved by designing a
time capsule with multiple points of entry into the hashing sequence (controlled
by agents).

The second factor relates to two aspects: collision freeness of the underlying
hash function and the structure of the timer. It is easy to see that sequential
execution of hashing can be circumvented if there is a collision in the sequence
traversed by the timer.



8.2 Time Capsules with Multiple Timekeepers 

It seems to be difficult (if not impossible) to design a stable time capsule with
an arbitrarily long delay. It is however possible to design a system in which multiple
timekeepers are used and they can collectively recover the secret if a big
enough collection has successfully completed their computations. The delay in
recovery of the secret will be enforced by the slowest timekeeper in the group.
Additionally, the secret will be recovered collectively, so if the delay is shorter
than assumed, all participants will be equally advantaged or disadvantaged.

Consider a group of timekeepers $\mathcal{P} = \{P_1, \ldots, P_n\}$ and an agent A whose
responsibility is to activate the time capsule. For this purpose, the agent keeps a share
for each timekeeper, so that when the time comes, the agent broadcasts the shares
and the timekeepers can start counting down.

Dealer 

The dealer sets up the whole system. The input parameters are the secret s and
the requested delay $\tau$.

1. First he designs n (2, 2) Shamir schemes, the i-th of which is used by the pair
$(A, P_i)$ to collectively hold the initial vector $r_i$, $i = 1, \ldots, n$. Let the schemes
be based on polynomials $f_i(x) = r_i + a_i x$ where

$$a_i = H(s, P_i, T),$$

s is the secret and T is a timestamp. The dealer privately sends $f_i(x_{P_i})$ to
$P_i$ and $f_i(x_A)$ to A.

2. The dealer winds up the timer by computing

$$h_i = H_\tau(r_i),$$

where $H_\tau$ is a collision-free hash function with the delay $\tau$.

3. Next he designs a (t, n) Shamir scheme which allows any t timekeepers to
recover the secret s. To do this, he takes n + 1 points: $(x_i, h_i)$, $i = 1, \ldots, n$, and
the point (0, s), and finds the polynomial

$$g(x) = s + b_1 x + \ldots + b_n x^n.$$

4. The co-ordinates $x_i$ and the coefficients $(b_t, \ldots, b_n)$ are made public.

Agent 

The agent activates the timekeepers by broadcasting the shares $f_i(x_A)$ to the
timekeepers.

Timekeepers 

There are n independent timekeepers. 

1. Each $P_i$ takes the share $f_i(x_A)$ and his own $f_i(x_{P_i})$, recovers the initial
vector $r_i$ and computes his $h_i$ (after the delay $\tau$).

2. Now a group of t active timekeepers collectively recovers the secret s.
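The final recovery step can be sketched as follows: the dealer interpolates a polynomial of degree n through (0, s) and the n points $(x_i, h_i)$, publishes its top coefficients, and any t timekeepers subtract the public part and interpolate the rest. The prime `Q`, the helper names and the use of plain integers for the values $h_i$ are assumptions made for this illustration only.

```python
Q = 2**61 - 1  # a prime modulus standing in for the field (assumption)

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % Q
    return out

def interpolate(points):
    """Coefficients (lowest degree first) of the polynomial through `points`."""
    coeffs = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = poly_mul(basis, [-xj % Q, 1])   # factor (x - xj)
                denom = denom * (xi - xj) % Q
        scale = yi * pow(denom, -1, Q) % Q
        for k, c in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * c) % Q
    return coeffs

def dealer_publish(s, xs, hs, t):
    """Fit g through (0, s) and all (x_i, h_i); publish b_t, ..., b_n."""
    g = interpolate([(0, s)] + list(zip(xs, hs)))
    return g[t:]

def recover(shares, public, t):
    """`shares` holds t pairs (x_i, h_i); returns the constant term s."""
    reduced = []
    for x, h in shares:
        tail = sum(b * pow(x, t + k, Q) for k, b in enumerate(public)) % Q
        reduced.append((x, (h - tail) % Q))   # remove the public high-degree part
    return interpolate(reduced)[0]
```

After the public terms are removed, what remains is a polynomial of degree at most t - 1 with constant term s, so t points determine it.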
Abstract. In a threshold RSA signature scheme, dishonest participants
can disrupt signature generation by submitting junk instead of their
partial signatures. A threshold signature system is robust if it allows
generation of correct signatures by a group of t honest participants, even
in the presence of malicious participants. The purpose of this paper is
two-fold. First we show that a robust (t, n) threshold RSA signature
scheme, proposed by Rabin in Crypto’98, lacks an essential property
of (t, n) threshold schemes and allows an adversary to forge signatures.

Then we propose a new approach to the construction of t-robust (t, n)
threshold RSA signature schemes which can be seen as the dual to Rabin’s
approach. We discuss the efficiency of our system and show that when t
is small (compared to n) our scheme is much more efficient than other
existing schemes.

1 Introduction 

Threshold cryptography, and in particular threshold signature, was independently
invented by Desmedt [13], Boyd [9], and Croft and Harris [12]. The main goal
of threshold cryptography is to replace a system entity, such as a transmitter,
in a classical cryptosystem with a group of entities sharing the same power. A
threshold cryptosystem must remain secure not only under the attacks on the
original cryptosystem, but also under new types of attacks that are introduced because
of the distributed structure of the system.

In a (t, n) threshold signature scheme [18], signature generation requires the
collaboration of at least t members of a set of n participants. Although the construction
of threshold signature schemes generally uses a combination of secret sharing
schemes and signature schemes, as noted in [14], a simplistic combination of the
two primitives could result in a completely insecure system that allows the members
of an authorized group to recover the secret key of the signature scheme. In
a secure threshold signature scheme the power of signature generation must be
shared among n participants in such a way that t participants can collaborate to
produce a valid signature for any given message whilst no subset of fewer than t
participants can forge a signature even if many signatures on different messages
are known.

A major problem in the construction of threshold RSA signature schemes
is that the secret exponent, which must be shared among the participants, is an
element of $Z_{\phi(N)}$, which is an Abelian group and not a field. This means that the
majority of classical secret sharing schemes, such as Shamir’s scheme, cannot be
used. The first simple and elegant solution to distributed RSA signature is due
to Boyd [9] and Frankel [21] and gives shares of the signature key d (the RSA
exponent) to n signers $P_1, \ldots, P_n$ such that $P_i$ holds $d_i$ and $d = d_1 + \cdots + d_n$.
To sign a message m, each signer $P_i$ produces a partial signature $m^{d_i}$; the partial
signatures are combined (multiplied) as $m^d = m^{d_1} \cdots m^{d_n}$ to create the signature on m.
This is an (n, n) scheme and requires the collaboration of every single member of the group
to generate a signature. This can be seen as a drawback which drastically
reduces the availability of the system, in particular in cases where the trust structure
of the group permits signing even if only t out of n group members collaborate. In
order to implement a system with a threshold t, t < n, one can generalize the
above (n, n) scheme using the cumulative secret sharing scheme developed in [32],
or adopt a protocol such as the one below. The dealer generates the shares
of $\binom{n}{t}$ independent runs of a (t, t) additive secret sharing scheme for the same
secret, the signing key d in this case, and gives the appropriate shares to each signer.
Now any subset of t group members has the complete set of shares for one run of the
secret sharing scheme and can sign a message. The main drawback of schemes
such as this is their inefficiency, in the sense that each signer has to store shares
whose total size is $\binom{n-1}{t-1}$ times the size of the RSA signing key, one share for each
t-subset containing that signer.
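The additive (n, n) sharing of the signing exponent can be illustrated with toy parameters. The sketch below is not a secure implementation: the primes are tiny, no message padding or hashing is applied, and the helper names are hypothetical. It only shows that the product of the partial signatures equals the ordinary RSA signature.

```python
import secrets
from math import prod

def share_additively(d, n, phi):
    """Split d into n random summands with d = d_1 + ... + d_n (mod phi)."""
    parts = [secrets.randbelow(phi) for _ in range(n - 1)]
    parts.append((d - sum(parts)) % phi)
    return parts

def partial_signature(m, d_i, N):
    """Signer P_i raises the message to its own share of the exponent."""
    return pow(m, d_i, N)

def combine(partials, N):
    """Multiply the partial signatures to obtain m^d mod N."""
    return prod(partials) % N

# Toy RSA key built from the safe primes 23 = 2*11 + 1 and 47 = 2*23 + 1.
p, q, e = 23, 47, 3
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)
```

A signature produced this way verifies under the public key exactly like a signature produced by a single holder of d, which is what makes the distribution transparent to verifiers.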

Desmedt and Frankel [18] initiated the study of efficient threshold RSA
signature and gave a heuristic solution for it. The basic idea is to generalize Shamir’s
polynomial scheme over $Z_{\phi(N)}$ by extending Lagrange polynomial interpolation
over a finite field to a module over a ring, and then to use it for signature
generation. The resulting scheme requires each participant to have a share
whose size is n times the size of the secret, in this case the RSA secret exponent.
Another elegant approach, which was implicitly proposed by Blackburn et al [6],
achieves the threshold t by utilizing an appropriate perfect hash family to combine
independent runs of a (t, m) RSA signature scheme where n > m. When
t is small compared to n, this approach is much more efficient than the Desmedt-Frankel
scheme and in its optimal form can reduce the size of each signer’s key
to O(log n) times the RSA signing key, which is much less than the n times the RSA
signing key required by the Desmedt-Frankel scheme.

In this paper we focus on another aspect of threshold signature systems:
robustness against dishonesty of participants. In shared generation of signatures,
dishonest participants may submit junk instead of their partial signatures.
A (t, n) threshold RSA signature scheme is called robust if it can correctly compute
the signature even in the presence of up to t - 1 arbitrary malicious signers.
Robust threshold signature schemes have very important applications. Although
distributing the signature generation process effectively distributes the responsibility of
a trusted node among n local nodes, and hence removes the system’s bottleneck,
there is always the danger of reduced availability because a faulty node can easily
disrupt the system. Adding robustness ensures that distribution of trust is
not at the expense of reduced availability. In a robust scheme partial signatures
are verifiable, and so the combiner uses only correct components (i.e. partial
signatures) in the computation; see also [25,23].

Desmedt pointed out [15] that because an RSA signature can be publicly
verified, it is possible to detect cheaters if more than t partial signatures are
used and the number of cheaters is $\ell$, where $0 \le \ell \le t - 1$. The basic idea is to
use $\ell + t$ partial signatures, instead of t, and note that among all $\binom{\ell + t}{t}$ subsets
of size t of the partial signatures, there is at least one subset of t correct partial
signatures that results in a signature that can be correctly verified. It is easy to
see that the main condition for this construction to work is that $\ell + t \le n$, which,
since $\ell \le t - 1$, is guaranteed whenever $2t - 1 \le n$.

Again the main drawback of the above solution is its inefficiency: for $\ell$ malicious
users, in the worst case $\binom{\ell + t}{t}$ signatures must be generated and verified,
and so the cost of such a system is prohibitive.

Gennaro et al [25], and independently Frankel et al [23], initiated the study of
robust RSA signature schemes. Gennaro et al developed two methods to achieve
robustness in such schemes. The first solution is an interactive protocol that uses
the zero-knowledge proof systems of [26], and the second is a non-interactive
scheme that is based on a novel technique, ICP (Information Checking Protocol),
to verify the integrity of the partial signatures.

Another approach to the construction of efficient robust threshold RSA
signature schemes, due to Frankel et al [23], is to extend the notion of result-checking
due to Blum et al [7] to witness-based cryptographic checking. The work in [23]
provides a more general theoretical framework, and also applies to RSA as a
specialization.

Rabin [30] further studied robustness in threshold RSA signature schemes
and proposed a scheme that also provides proactivity.

Rabin’s (t, n) robust threshold RSA signature scheme [30] is simple and has
low memory and computation cost, and can be viewed as a “ramp scheme”
where degradation in the threshold is allowed. Its simplicity is due to this
relaxation, which may allow every honest shareholder, after t - 1 bad ones are
exposed, to generate a signature on its own. However, as we will show in Section
3, the scheme is not (t, n) threshold and requires the collaboration of n participants
((n, n) threshold) for the generation of a signature. This is followed by a new scheme
which is both (t, n) threshold and robust against up to t - 1 malicious adversaries.
This construction is motivated by Blackburn et al’s work [6], and builds a (t, n)
threshold scheme from multiple runs of a (t, t) scheme with a cheater detection
property, combined by using a perfect hash family. The scheme is particularly
efficient for large groups with a small threshold.

The paper is organized as follows. In Section 2 we describe the model. In
Section 3 we review Rabin’s robust threshold RSA signature scheme and show
an attack that allows generation of valid signatures by a malicious participant.
In Section 4 we present a (t, t) RSA signature scheme with cheater detection and then
propose our (t, n) robust threshold RSA signature scheme in Section 5. The
paper is concluded in Section 6.

2 Preliminaries 

The Communication Model. We will follow the model of [25] and [30]. We
assume that the system consists of a set of n participants $\{P_1, \ldots, P_n\}$. They
are connected by a complete network of private (secrecy-preserving) point-to-point
channels. In addition, they have access to a dedicated broadcast channel;
by dedicated we mean that if $P_i$ broadcasts a message, it will be recognized by
the other participants as coming from $P_i$.

The Adversary. We assume that an adversary can corrupt up to t - 1 of the
n participants in the network. We consider the worst possible kind of adversary,
i.e. a malicious adversary that learns all the information held by the corrupted
participants and hears the broadcast messages. He can cause corrupted players
to behave in any possible malicious way.

The Dealer. In order to focus on the high-level description of the protocols, 
we further assume that there is a dealer, who sets up the keys. This includes 
generation of RSA key, and generation and distribution of shares to participants. 
The trusted dealer can be eliminated by a distributed key generation process of 
Boneh and Franklin [8]. 

Notation. For a positive integer k we denote $[k] = \{1, \ldots, k\}$. The public modulus
is denoted by N. We assume N = pq, where p and q are safe primes. That is,
$p = 2p' + 1$, $q = 2q' + 1$, where $p < q$ and $p, q, p', q'$ are prime numbers. We
denote $\phi(N) = (p - 1)(q - 1)$ and by $d \in [\phi(N)]$ the secret key of RSA.

3 Rabin Robust Threshold Scheme and Its Weakness 

In Crypto’98, Rabin [30] suggested an approach to achieving robustness in (t, n)
threshold RSA signature schemes through the use of share back-ups. Rabin’s
scheme [30] is t-robust for $t \le \lfloor n/2 \rfloor$, which means that it can correctly compute
signatures even in the presence of up to t - 1 malicious (corrupted) participants.
However, we will show that it is not really (t, n) threshold, in the sense that it
actually requires the collaboration of all n participants to generate a signature.
If one of the group members is unavailable, the only way of generating the
signature is by reconstructing the absent participant’s partial key, followed by a
key proactivisation to give new keys to the participants. We will show that allowing
partial keys to be reconstructed leads to an attack that can effectively reveal the
secret key of the RSA. The attack is described in Section 3.2.



3.1 Rabin’s Scheme 

Rabin [30] continued the work of Gennaro et al [25] and proposed a simplified
approach to threshold robust RSA which reduces the memory and computation cost
of the system. In this approach an (n, n) RSA signature scheme in the additive
form, where each participant holds a secret input (his partial key), is used
to construct the signature. Each participant’s input is then shared among all
the other participants using a verifiable secret sharing scheme and forms the back-up
information. In the event that one or more (up to t - 1) participants fail to
cooperate, or are faulty, their information can be reconstructed from the
‘back-up’ copy and incorporated into the computation.

A brief description of Rabin’s scheme follows. 

Key Generation. In the key generation phase, the dealer chooses the RSA
secret key $d \in Z_{\phi(N)}$ and performs the following steps:

1. Chooses and securely gives to $P_i$ his secret signature generation key
$d_i \in_R [-nN^2, \ldots, nN^2]$, for $1 \le i \le n$, and sets $d_{public} = d - \sum_{i=1}^{n} d_i$.

2. Computes the witnesses $w_i$, given below, and broadcasts them:

$$w_i = g^{d_i} \pmod N, \quad 1 \le i \le n.$$

3. Uses a (t, n) verifiable secret sharing scheme, similar to [20,22] and described in
[30], to share $d_i$ among the n participants. We follow Rabin and refer to
this protocol as (t, n) Feldman-$Z_N$-VSSS.

Signature Generation. To sign a message m, participants carry out the
following steps:

1. Participant $P_i$ publishes his partial signature $\sigma_i = m^{d_i} \pmod N$.

2. The signature is calculated as $m^{d_{public}} \prod_{i=1}^{n} \sigma_i \pmod N$. If this signature
can be correctly verified using the public key, the process finishes.

3. If an error is detected in the signature, each participant $P_i$ must prove the
correctness of his partial signature using Gennaro et al’s non-interactive
partial signature verification protocol [25].

4. If $P_i$’s proof fails, more than t participants reconstruct $d_i$ using the shares
of the Feldman-$Z_N$-VSSS and obtain $\sigma_i = m^{d_i} \pmod N$.

5. Compute the signature $SIG(m) = m^{d_{public}} \prod_{i=1}^{n} \sigma_i \pmod N$.

It is shown that the memory and computation cost of this scheme is less than
that of the protocols of [25] and [23] for robust threshold RSA signature schemes.
However, although Rabin’s scheme does allow the construction of signatures if only
t participants collaborate, in practice whenever fewer than n participants
collaborate, the partial keys of all the other participants are revealed and a
proactivisation phase is required. In the following section it is shown that this could be
used by an attacker to forge signatures.
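The fault-free path of the scheme described above (key generation step 1 and signature generation steps 1 and 2) can be sketched as follows. The toy modulus, the function names and the omission of the witnesses and of the verifiable secret sharing are simplifications assumed for this illustration; negative exponents rely on Python's built-in `pow` computing modular inverses, which requires the message to be coprime to N.

```python
import secrets

def rabin_keygen(d, n, N):
    """Pick d_i uniformly from [-n*N^2, n*N^2]; d_public absorbs the difference."""
    bound = n * N * N
    d_parts = [secrets.randbelow(2 * bound + 1) - bound for _ in range(n)]
    d_public = d - sum(d_parts)          # computed over the integers
    return d_parts, d_public

def rabin_sign(m, d_parts, d_public, N):
    """Multiply m^d_public with all partial signatures sigma_i = m^d_i mod N."""
    sig = pow(m, d_public, N)
    for d_i in d_parts:
        sig = sig * pow(m, d_i, N) % N   # one factor per participant
    return sig

# Toy parameters (assumption): N = 23 * 47, e = 3.
N, phi, e = 1081, 1012, 3
d = pow(e, -1, phi)
```

Because the $d_i$ are chosen over the integers without knowledge of $\phi(N)$, the public correction term $d_{public}$ is what ties the product back to the real exponent d.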

3.2 A Weakness of Rabin’s Scheme 

We first review some basic notions underlying robust (t, n) threshold signature
schemes. The aim of a (t, n) threshold signature scheme is to allow any subset of
at least t participants to generate a valid signature for an arbitrary message m.
Thus, an important feature of threshold schemes is that they increase the availability
of the system, as only t active and honest participants are enough to produce a
valid signature. Participants who do not take part in the signature generation (for
m) are called non-active participants (for m). A participant is called corrupted
if he does not follow the protocol. The scheme is t-robust if it can correctly
compute valid signatures even in the presence of up to t - 1 corrupted
participants. It should be emphasized that in robust threshold signature schemes,
a non-active participant is not necessarily corrupted. This means that a (t, n)
threshold signature scheme allows up to n - t non-active participants, while in
a t-robust scheme, the condition $t \le \lfloor n/2 \rfloor$ must be satisfied, as the adversary who
has corrupted t - 1 participants can perform any malicious action. That is, in
a (t, n) threshold signature scheme which is t-robust, if the non-active participants
are all corrupted, then $n - t + 1 \le \lfloor n/2 \rfloor$ and so $t > \lceil n/2 \rceil$, which is a contradiction.

Next we give a simple attack to show that Rabin’s scheme can be subverted
to forge signatures. The adversary needs to corrupt only one participant: without
loss of generality, assume that $P_1$ is the corrupted participant. $P_1$ will perform
the following steps:

1. $P_1$ asks a subset $\{P_i, i \in A\}$ of participants such that $|A \cup \{1\}| = t$ to
co-operate to sign a message m.

2. $P_1$ informs the participants $P_i$, $i \in A$, that the participants $P_j$, $j \in \{1, \ldots, n\} \setminus A$,
are non-active (as the scheme is assumed to be (t, n) threshold).

3. At the same time, $P_1$ starts a parallel session, this time asking $\{1, \ldots, n\} \setminus A$
to co-operate to sign another message m'. This time he will claim that
the participants in A are non-active. We note that $|\{1, \ldots, n\} \setminus A| \ge t$.

4. At the end of the two parallel runs, $P_1$ has all the partial keys and so can
construct the secret key d.

It is worth noting that the two runs of the protocol are in parallel and result 
in the recovery of the secret key and so proactivisation will not be helpful as it 
will only change representation of the key and not its value. 
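The bookkeeping behind the attack can be made explicit with a small model. The sketch abstracts away the cryptography entirely: `run_session` is a hypothetical stand-in for one signing session in which the partial keys of all participants declared non-active are reconstructed from the back-ups and thus become known to the active group, including $P_1$.

```python
def run_session(active, all_ids, partial_keys):
    """Model of one session: keys of non-active participants get reconstructed."""
    return {j: partial_keys[j] for j in all_ids if j not in active}

def attack(n, t, partial_keys, d_public):
    """P_1 runs two parallel sessions with complementary 'non-active' sets."""
    all_ids = set(range(1, n + 1))
    A = set(range(2, t + 1))                 # chosen so that |A ∪ {1}| = t
    learned = {1: partial_keys[1]}           # P_1 knows its own key
    # Session 1: A ∪ {1} signs, everyone outside is declared non-active.
    learned.update(run_session(A | {1}, all_ids, partial_keys))
    # Session 2: the complement of A signs, A is declared non-active.
    learned.update(run_session(all_ids - A, all_ids, partial_keys))
    assert set(learned) == all_ids           # every partial key is now known
    return d_public + sum(learned.values())  # the RSA secret exponent d
```

The two sessions together cover every participant, which is why a later proactivisation cannot help: the value of d itself has already been exposed.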

This means that: 

Rabin’s robust threshold RSA signature [30] scheme is (n, n) threshold and 
t-robust, but not (t, n) threshold. 

The weakness of Rabin’s scheme is due to the fact that during signature
generation, the keys of (non-active or crashed) participants are revealed. In an
RSA-based threshold signature, it is important to keep participants’ keys secret
and only reveal partial signatures. For further discussion on this, see [17].

4 (t, t) RSA Signature Scheme with Cheater Detection 

The idea of our approach is to use (t, t) schemes as building blocks in constructing
a (t, n) scheme; a similar approach of using (t, t) systems as building blocks can be
found in [22]. To this end, in this section we describe a (t, t) RSA
signature scheme that allows detection of cheaters, which will then be used to
build a robust (t, n) threshold scheme. The scheme uses Gennaro et al’s partial
signature verification protocol [25] to verify partial signatures in the additive
(t, t) threshold RSA signature scheme. It consists of the following two phases.

1. Key generation: The dealer, who has generated the RSA secret key $d \in [\phi(N)]$,
carries out the following steps:

- Chooses and secretly gives the value $d_i \in_R [\phi(N)]$ to $P_i$ for $1 \le i \le t$ such
that $d = d_1 + \cdots + d_t \pmod{\phi(N)}$.

- For each pair of participants $P_i$ and $P_j$, the dealer generates and distributes
secret information such that $P_i$ and $P_j$ can execute Gennaro
et al’s partial signature verification protocol [25] to verify the correctness of
partial signatures. At the end of the distribution phase each $P_i$ holds the
following values.

(a) His share $d_i$;

(b) Auxiliary authentication values $y_{i,1}, \ldots, y_{i,t}$, where $y_{i,j} \in Z$ is used
to prove the correctness of his partial signature to $P_j$;

(c) Verification data $v_{1,i}, \ldots, v_{t,i}$, where $v_{j,i} = (b_{j,i}, c_{j,i})$ such that
$b_{j,i} \in [N^{\delta_1}]$ and $c_{j,i}$ is drawn from the range prescribed in [25]. The 2-tuple
value $v_{j,i}$ is used to verify the correctness of $P_j$’s partial signature.

We will denote the collection of values held by $P_i$, containing the values
in (a), (b) and (c), by $s_i$.

2. Signature generation: For each pair $P_i$ and $P_j$, $P_j$ can detect $P_i$’s
cheating in the following way.

- $P_i$ broadcasts his partial signature $\sigma_i = m^{d_i} \pmod N$ and the auxiliary
values $Y_{i,j} = m^{y_{i,j}} \pmod N$, for $j = 1, \ldots, t$.

- $P_j$ accepts $P_i$ as an honest participant if $\sigma_i^{b_{i,j}} m^{c_{i,j}} = Y_{i,j} \pmod N$, and concludes
that the partial signature of $P_i$ is $m^{d_i} \pmod N$ or $-m^{d_i} \pmod N$;
otherwise he detects $P_i$ as a cheater.

- After performing the verification phase, and given that a participant accepts
t partial signatures, he can generate a signature for m, $SIG(m) = \sigma_1 \cdots \sigma_t$
or $-\sigma_1 \cdots \sigma_t$. Using the public key, it can easily be determined
which one is the correct signature.



Theorem 1. ([25]) In the above scheme, the following hold:

Completeness: If t participants follow the protocol, then they will always
generate a correct signature SIG(m) for a message m;

Soundness: A cheating $P_i$ can convince $P_j$ to accept $\sigma_i \neq \pm m^{d_i} \pmod N$
only with the small probability bounded in Theorem 5 of [25];

Zero-Knowledge: Up to t - 1 cheating participants do not learn any more
information about another participant’s key other than his partial signature.

The proof is straightforward from Theorem 5 in [25].
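The verification of a partial signature in this scheme can be illustrated by the following sketch. It assumes the information-checking relation $y = c + b \cdot d_i$ over the integers, as in the protocol of [25]; the function names and the toy modulus are hypothetical, and the sizes of b and c are not chosen according to any security parameter.

```python
def icp_tag(d_i, b, c):
    """Dealer side: authentication value y = c + b*d_i, over the integers."""
    return c + b * d_i

def prove(m, d_i, y, N):
    """P_i broadcasts its partial signature and the auxiliary value m^y."""
    return pow(m, d_i, N), pow(m, y, N)

def verify(m, sigma, Y, b, c, N):
    """P_j, holding (b, c), accepts iff sigma^b * m^c == Y (mod N)."""
    return pow(sigma, b, N) * pow(m, c, N) % N == Y

N = 1081                 # toy modulus 23 * 47 (assumption)
m, d_i, b, c = 3, 123, 5, 77
y = icp_tag(d_i, b, c)
sigma, Y = prove(m, d_i, y, N)
```

An honest prover always passes because $\sigma^b m^c = m^{b d_i + c} = m^y$; a prover who alters $\sigma$ would have to guess the verifier's secret pair (b, c) to produce a matching auxiliary value.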

5 A (t, n) Robust Threshold RSA Signature Scheme 

Now we present an efficient solution to threshold RSA signature that also
provides robustness. In order to achieve the threshold, we use the above (t, t) threshold
RSA signature scheme as the building block and ‘lift’ it to (t, n) threshold by
applying a perfect hash family. While Rabin used an (n, n) threshold RSA
signature scheme as the underlying scheme and then used a (t, n) secret sharing
scheme to ‘break’ the underlying scheme to achieve a (t, n) threshold signature,
our approach can be considered as going in the opposite direction. Both
approaches utilize the simplicity of the additive threshold scheme, and rely on
Gennaro et al’s partial signature verification protocol to achieve robustness.

Before we describe the details of our scheme, we briefly review some basic
notions and results on perfect hash families. An (n, m, t)-perfect hash family is
a set of functions $\mathcal{F}$ such that

$$f : \{1, \ldots, n\} \to \{1, \ldots, m\}$$

for each $f \in \mathcal{F}$, and for any $X \subseteq \{1, \ldots, n\}$ such that $|X| = t$, there exists at
least one $f \in \mathcal{F}$ such that $f|_X$ is one-to-one.
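The defining property can be checked by brute force for small parameters. In the sketch below each function is represented as a Python dictionary from {1, ..., n} to {1, ..., m}; the representation and the name `is_perfect_hash_family` are choices made for this illustration.

```python
from itertools import combinations

def is_perfect_hash_family(family, n, t):
    """True iff every t-subset of {1..n} is mapped one-to-one by some f."""
    for X in combinations(range(1, n + 1), t):
        if not any(len({f[x] for x in X}) == t for f in family):
            return False
    return True

# A small example with n = 4 and m = t = 2: two functions are enough.
f1 = {1: 1, 2: 1, 3: 2, 4: 2}
f2 = {1: 1, 2: 2, 3: 1, 4: 2}
```

Here `f1` separates every pair except {1, 2} and {3, 4}, and `f2` separates exactly those two, so the two functions together form a perfect hash family for n = 4 and t = 2, while neither does on its own.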

We use the notation PHF(W; n, m, t) for an (n, m, t)-perfect hash family
with $|\mathcal{F}| = W$ and will write $\mathcal{F} = \{f_1, \ldots, f_W\}$. When m = t, PHF(W; n, t, t)
is called a minimal perfect hash family. Perfect hash families originally
arose as part of compiler design; see [28] for a summary of the early results, and
[11] for a survey of recent results. Perfect hash families have found numerous
applications in the circuit complexity of threshold functions and in the design
of deterministic analogues of probabilistic algorithms; see [1]. Numerous constructions
of perfect hash families using finite geometries, design theory and
error-correcting codes are known.

The connection between perfect hash families and the construction of efficient
threshold secret sharing schemes was implicitly discovered by Blackburn et al in
[6]. The use of perfect hash families for the efficient construction of threshold
signature schemes was later noted in the survey papers [15] and [4]. While [15]
and [4] are only concerned with providing the threshold property, we are also
concerned with robustness. In order to be able to exploit the simple additive (t, t)
threshold scheme, we will use a particular class of perfect hash families: minimal
perfect hash families.

Our (t, n) robust threshold RSA signature scheme consists of two phases:
Key generation and Signature generation. In the key generation, a dealer D,
who knows the secret key d of RSA, generates and distributes the secret key
information (shares) to all participants involved in the system. The secret key
of each participant consists of three parts: (1) the value for generating his partial
signature; (2) the values for proving the correctness of his partial signature
to other participants; and (3) the values for verifying partial signatures from
other participants. During signature generation, participants i) generate partial
signatures; ii) prove the correctness of their own signatures and verify the validity of other
participants’ partial signatures; and iii) combine the correct partial signatures
to generate a full signature.

Another way of interpreting our approach is as follows. We
execute multiple runs of a (t, t) threshold RSA signature scheme with cheater
detection, and then apply a perfect hash family to assign secret keys to the n
participants in such a way that every t out of n participants have the complete
set of shares for at least one (t, t) scheme and so can sign any message. However,
because of the existence of corrupted participants we need to ensure robustness.
Our basic approach is to employ the cheater detection capability of the underlying
(t, t) scheme to ‘filter out’ cheating participants by successively forming
groups of size t, executing the (t, t) scheme with cheater detection to detect
cheaters, and substituting them with new participants. This procedure finishes
when a set of t honest participants, which is known to exist, is found. At this
stage a correct signature can be generated. In essence the scheme uses the basic
approach of trying to form a group of t non-faulty players, but instead of forming
and verifying all signatures that can be formed from a set of 2t - 1 partial signatures
as described before (Section 1), it removes the cheaters in the group and substitutes
them with new participants. In the worst case, it requires performing the
underlying (t, t) scheme t - 1 times, a significant reduction from the trivial
solution, which requires $\binom{2t-1}{t}$ executions of the underlying (t, t) scheme in the worst
case. Moreover, when t is small (relative to n), it is known that PHF(W; n, t, t)
with W = O(log n) exist; thus the scheme achieves significantly higher storage
and computation efficiency through the use of perfect hash families.
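As a quick numerical illustration of the gap between the two worst-case counts (the values of t are chosen arbitrarily here):

```latex
\binom{2t-1}{t}\Big|_{t=5} = \binom{9}{5} = 126 \quad\text{versus}\quad t - 1 = 4,
\qquad
\binom{2t-1}{t}\Big|_{t=10} = \binom{19}{10} = 92378 \quad\text{versus}\quad t - 1 = 9.
```

The number of subsets to try grows exponentially in t, whereas the number of filtering rounds grows only linearly.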

We now present our robust (t, n) threshold RSA signature scheme. 

Key Generation. Assume that $\mathcal{F} = \{f_1, \ldots, f_W\}$ is a PHF(W; n, t, t) minimal perfect
hash family and $0 < \delta_1, \delta_2 < 1$ are two security parameters. The hash
family and the security parameters are both publicly known. Let $d \in [\phi(N)]$
be the RSA secret key, which is chosen by the dealer and kept secret. The
dealer generates the partial keys of W independent runs of the (t, t) RSA
signature scheme with cheater detection (given in Section 4) for the same RSA
secret key d. That is,

$$S^1 = (s^1_1, \ldots, s^1_t), \; \ldots, \; S^W = (s^W_1, \ldots, s^W_t),$$

where $s^k_i$ denotes the key of the i-th participant in the k-th run of the underlying
(t, t) RSA signature scheme, for $1 \le i \le t$ and $1 \le k \le W$. This can be expressed
as a t × t array consisting of a single row, the i-th row, and a single column,
the i-th, given below.

$$s^k_i = \begin{pmatrix}
 & & v^k_{1,i} & & \\
 & & \vdots & & \\
y^k_{i,1} & \cdots & d^k_i & \cdots & y^k_{i,t} \\
 & & \vdots & & \\
 & & v^k_{t,i} & &
\end{pmatrix}$$

such that the following conditions are satisfied:

- $b^k_{i,j} \in [N^{\delta_1}]$, $c^k_{i,j}$ lies in the range prescribed in [25], and $y^k_{i,j} = c^k_{i,j} + d^k_i b^k_{i,j}$
for all $1 \le i \ne j \le t$ and all $1 \le k \le W$;

- $d^k_1 + d^k_2 + \cdots + d^k_t = d$ for all $1 \le k \le W$.

The dealer then distributes the secret keys to the n participants $P_1, \ldots, P_n$
in such a way that $P_\ell$, $1 \le \ell \le n$, holds

$$s^1_{f_1(\ell)}, \; s^2_{f_2(\ell)}, \; \ldots, \; s^W_{f_W(\ell)}.$$









Signature Generation. Generation and verification of partial signatures proceed by
using the perfect hash family to reduce to the underlying (t, t) scheme.
Assume that participants $\{P_i \mid i \in A\}$, where $A \subseteq \{1, \ldots, n\}$ and $|A| \ge t$,
want to generate the signature for a message m. Without loss of generality,
we assume that $|A| = t$ and $A = \{1, \ldots, t\}$.¹ From the property of the
minimal perfect hash family it follows that there exists a function $f_k \in \mathcal{F}$
such that $f_k$ restricted to A is one-to-one. To generate a signature for a
message m, any pair of participants $P_i$ and $P_j$ in the group carry out the
following procedure.

- $P_i$ broadcasts his partial signature $\sigma_i = m^{d^k_{f_k(i)}} \pmod N$ and the auxiliary values
$Y_{i,j} = m^{y^k_{f_k(i), f_k(j)}} \pmod N$.

- $P_j$ verifies whether $\sigma_i^{\,b^k_{f_k(i), f_k(j)}} \, m^{c^k_{f_k(i), f_k(j)}} = Y_{i,j} \pmod N$. If yes, $P_j$ accepts
$\sigma_i$ as a correct partial signature for m from $P_i$.

- The cheating participants are ‘filtered out’ and substituted with new participants
to form a new group of size t, and the above two steps are repeated
until t correct partial signatures are obtained. The
RSA signature is obtained by multiplying the correct partial signatures.



Theorem 2. Under the assumption that factoring is intractable the above 
scheme is a secure t-robust and {t, n) threshold RSA signature scheme for any 

Proof (sketch). Completeness and soundness follow directly from the
properties of perfect hash families and from the completeness and soundness of
the underlying $(t, t)$ threshold signature schemes.

We are left to prove the security of the scheme. The proof uses a simulation
argument for the view of the adversary: we show that an adversary who has access
to all the key information of the corrupted participants and the signature on $m$
could generate by itself all the other public information produced by the
protocol. Observe that, due to the independence of the $W$ runs of the underlying
$(t, t)$ threshold signature schemes and the properties of the perfect hash
family, an adversary who has corrupted up to $t - 1$ participants has no advantage
with respect to an individual run of the $(t, t)$ scheme. Thus it is sufficient to
show that the adversary cannot break any of the $(t, t)$ schemes.

In the following we show the security of the underlying $(t, t)$ scheme. Without
loss of generality, assume that an instance of the underlying $(t, t)$ scheme
consists of $t$ participants $P_1, \ldots, P_t$ and an adversary A who has corrupted
the first $t - 1$ participants $P_1, \ldots, P_{t-1}$ and has learned their secrets. We
give a simulator SIMU for our scheme. The input to the simulator SIMU is the
message $m$ and its signature $m^d \pmod N$. However, the secret information held by
$P_t$ is never exposed and is not simulated.

(Footnote to the assumption $|A| = t$ made earlier.) The assumption effectively
means that if $|A| > t$ we can choose a $B \subset A$ such that $|B| = t$ and apply the
protocol to $B$. Also, the $t$ participants are not necessarily the first $t$
participants.

The SIMU works as follows. 

1. Key Generation 

- Chooses Ji, .. Gfl [</>(./V)], 

— Executes simulation of key generation of Gennaro et al partial signa- 
ture verification protocol. At the end of the distribution phase of the 
simulation, each Pi holds the following values, where i = — 1. 

(a) di Gr 7 ^ 4 ,{n)] 

(b) Auxiliary authentication values yip, . . . , yip S Z; 

(c) Verification data uip, . . . , Vt,i where = (&p,i, Cjp) 

such that hj^i Gr [iV‘^i],cpp Gr y. . _ djhj^i + Cj^i. Note 

that Vj^i is used to verify the correctness of partial signature of Pj. 

2. Signature generation 

— Computes partial signature di = (mod N),i= 1 and the 

auxiliary value Yij = (mod N),i= 1, . . . , t — 1, j = 1, . . . , t. 

— Sets dt = rrU / Oi^i (mod N) and the auxiliary values ftp, . . . , ltp_i 

by 

Yt,k = , fc=l,...,t-l. 

— Executes simulation of partial signature verification protocol. 

It is straightforward to verify that the view of the adversary A on execution 
of the protocol, and its view on execution of SIMU are statistically indistin- 
guishable, and so the result follows. 

Efficiency 

Now consider the efficiency of our scheme. The size of the key used for
signature generation is $\log N$, which is the same as the regular RSA key and is
much better than that of Rabin's scheme [30], which is $2 \log nN$, and that of the
scheme of Gennaro et al. [25], which is $n \log N$.

The memory requirement for each participant in our scheme is
$2W(1 + \delta_1 + \delta_2)\, t \log N$, where $W$ is the size of the minimal perfect hash
family; for the schemes of Gennaro et al. and Rabin it is
$2(1 + \delta_1 + \delta_2)\, n^2 \log N$ and $2(2 + \delta_1 + \delta_2)\, n \log nN$, respectively. Thus
the memory requirements of our scheme depend on the size of the perfect hash
family. It is known that when $t$ is small (relative to $n$), minimal perfect hash
families with $W = O(\log n)$ exist. This means that for large group sizes with
small threshold values, the storage requirements of our scheme are much better
than those of the schemes of Gennaro et al. and Rabin.

6 Conclusions 

In this paper we addressed a weakness of Rabin's robust $(t, n)$ threshold RSA
signature scheme and proposed a new approach to the construction of efficient
and robust threshold RSA signature schemes. Our approach provides an efficient
construction for robust $(t, n)$ threshold RSA signatures only when the threshold
$t$ is small compared to $n$. An efficient solution for general systems, in
particular for the optimal-resilience case $n = 2t + 1$, remains open.

The existing RSA threshold signature schemes require a trusted dealer, who
knows the secret exponent and the other secret parameters of the RSA system.
Eliminating the trusted dealer from distributed RSA is studied by Boneh and
Franklin [8], Blackburn, Blake-Wilson, Burmester and Galbraith [5], Frankel,
MacKenzie and Yung [24], and Miyazaki, Sakurai and Yung [29].

An interesting extension of our scheme is to distribute the computation and
provide an efficient construction for systems without the trusted dealer.
Abstract. We consider methods for threshold RSA decryption among
distributed agencies without any dealer or trusted party. We present
two methods. One is based on the two previous techniques of [FMY98]
and [FGMY97]. It demonstrates the feasibility of combining
distributed key generation with secure RSA function application. The
other method [MS99] is a newly developed technique based on [FMY98]
and further inspired by Simmons' protocol failure of RSA (we believe
that it is very interesting that a "protocol failure attack" can be turned into
a constructive method!). The latter requires less "distributed
computation" as the key is being set up, and it can be incorporated more smoothly
into the existing distributed key generation techniques.



1 Introduction 

The area of distributed cryptography has been very active in the last few years. 
In particular threshold cryptosystems and proactive cryptosystems have been 
developed to allow for distribution of the power to perform signatures or de- 
cryption in an organization. It is a very interesting key management technique 
where the outside world does not get exposed to the internals of the organization 
(see surveys in [Des92,FY98]). 

Distributed RSA systems were developed, but (unlike the case of discrete-log
based distributed systems, where distributed key generation was known for a
while but was corrected recently) the key generation was assumed to be done by a
centralized dealer [DF91,DDFY94,FGY96,GJKR96,FGMY97,R98]. The issue of
initiating and then operating such a system with distributed parties only was
open for a while.

Boneh and Franklin [BF97] changed this situation and showed how a set of
three or more participants can generate an RSA function distributedly. Their
solution assumed honest parties. It was generalized to withstand faults by
Frankel, MacKenzie and Yung [FMY98], who extend Boneh-Franklin's $(n, n)$ scheme
to a robust $(t, n)$ threshold scheme. The scheme of Frankel-MacKenzie-Yung was
motivated as an initiation of a distributed RSA service where the
key is never held at a single location. How to connect the generation scheme
to subsequent signing was not explicitly expressed in their work. We take the
step of validating that we can indeed connect the signing methods with the
generation methods. With the strong "share representation" modification
techniques from [FGMY97], changing key representation and adapting
representations between schemes looks doable. In fact, one of our methods
essentially realizes such a scheme, demonstrating how to run a distributed RSA
service from "key birth" to its actual usage.

Our Contribution: 

We believe that closing the various gaps which were left over, and ensuring that
a complete distributed RSA service can be performed, is quite an interesting
issue, which we cover in this work. We in fact give two techniques for bridging
the "distributed RSA application" to a starting step of "distributed
key generation" (i.e., with no dealer). We give an RSA function application
(signing and decrypting) which is initiated directly from the distributedly
generated RSA key.

We present two methods. The first is based on the two previous techniques
of [FMY98] and [FGMY97]; it carefully checks the feasibility and fills in the
details of how to start the RSA service from the distributed generation. The
other method is a newly developed technique which, interestingly, is inspired by
Simmons' protocol-failure attack on RSA.

The latter method requires less "distributed computation", especially in
adapting the key generation to the function application period. Thus, our newly
proposed scheme seems overall more efficient than the combination of
the previous methods. The former method, however, requires less computation
at the combiner, so it may be useful when the combining function runs on small
devices.

Note that such a comparison of these two schemes indicates a new measure of
the performance of a distributed cryptographic protocol that consists of multiple
stages, as we point out the need to look at the combined performance. Tradeoffs
have to be assessed based on the computational context, and we follow this
approach.

Related Works:

Shoup [Sho99] independently presented threshold RSA signatures that use the
GCD computation, as in our second method. Shoup's model requires a trusted
dealer, whereas ours assumes no trusted dealer. In Shoup's schemes, the size
of an individual signature is bounded by a constant times the size of the RSA
modulus. Our second method achieves a similar property, while the size of an
individual share increases with the number of distributed agencies and with the
threshold.

2 The Starting Point: Threshold (Proactive) RSA with a 
(Trusted) Dealer 

Frankel, Gemmell, MacKenzie and Yung [FGMY97] presented a protocol for
RSA function sharing in the dealer model. We start by reviewing their scheme.
(See [FGMY97] for the details.)

2.1 Key Distribution

The dealer distributes a secret key to the $n$ shareholders with threshold $t$.

The dealer first generates a (variant of) RSA public key $(e, N)$ and its secret
key $d$. Let $L = (n - 1)!$ and $H = \gcd(e, L^2)$. (Indeed, [FGMY97] and [FMY98]
use $L = n!$; however, the smaller $(n - 1)!$ suffices.) Then, the dealer computes
$(P, s)$ such that $eP + \frac{L^2}{H} s = 1$ by using the extended Euclidean algorithm.
Next, the dealer computes $k$ such that $d \equiv P + L^2 k \pmod{\phi(N)}$. Note that the
relation $k \equiv d s H^{-1} \pmod{\phi(N)}$ holds.
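
The splitting of $d$ into a public part $P$ and a multiple of $L^2$ can be checked numerically. The following is a minimal sketch, assuming toy parameters ($p = 11$, $q = 23$, $e = 7$, $n = 4$) that are chosen here only for illustration and do not come from [FGMY97]; it also assumes $e$ is coprime to $L^2/H$ so that the extended Euclidean step has a solution.

```python
from math import factorial, gcd

# Toy parameters (assumptions for illustration only; far too small for real use).
p, q = 11, 23
N, phi = p * q, (p - 1) * (q - 1)   # N = 253, phi(N) = 220
e = 7                               # public exponent, coprime to phi(N) and to L
d = pow(e, -1, phi)                 # secret exponent
n = 4                               # number of shareholders

L = factorial(n - 1)
H = gcd(e, L * L)
m = (L * L) // H                    # the cofactor L^2 / H

# Extended Euclid, done here via a modular inverse: find (P, s) with e*P + (L^2/H)*s = 1.
P = pow(e, -1, m)
s = (1 - e * P) // m
assert e * P + m * s == 1

# k = d * s * H^{-1} mod phi(N); this needs H to be invertible mod phi(N).
k = (d * s * pow(H, -1, phi)) % phi

# The dealer's splitting relation d = P + L^2 * k (mod phi(N)).
assert (P + L * L * k) % phi == d
```

The modular inverse with a negative exponent requires Python 3.8 or later.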

Now the dealer chooses a random polynomial

$$f(x) = f_0 + f_1 x + f_2 x^2 + \cdots + f_{t-1} x^{t-1},$$

where $f_j \in \{0, L, \ldots, 2L^3 n^{2+\epsilon} t\}$ $(1 \le j \le t - 1)$ and $f(0) = L^2 k$.

Finally, the dealer computes $s_j = f(j)$, then secretly sends $(P, s_j)$ to each
shareholder $P_j$.



2.2 Distributed Decryption 

We consider how a set of $t$ shareholders among the $n$ shareholders can decrypt.
Suppose a client has a ciphertext

$$C = M^e \pmod N$$

encrypted with the dealer's public key $(e, N)$. The client wants to decrypt this
encrypted message $C$ by asking a set $A$ of $t$ shareholders.

Step 1: The client sends $C$ to each shareholder $P_j$ $(j \in A)$.

Step 2: Each shareholder $P_j$ first computes $\sigma_j = s_j \lambda_{j,A}$ with his
(distributed) secret $s_j$, where

$$\lambda_{j,A} = \prod_{l \in A \setminus \{j\}} \frac{l}{l - j},$$

then computes $Z_j = C^{\sigma_j} \pmod N$. Each shareholder $P_j$ secretly sends $Z_j$
to the client via a secure channel.

Some shareholder should also send $C^P \pmod N$ to the client. (Note that
this may be redundant, because anybody can compute $(P, s)$.)

Step 3: From the received pieces $Z_j$ and $C^P \pmod N$, the client decrypts the
message $M$ as follows:

$$C^P \prod_{j \in A} Z_j = C^{P + L^2 k} = M^{ed} = M \pmod N.$$
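
The three steps can be traced end to end on small numbers. The sketch below is an illustration under stated assumptions, not the protocol of [FGMY97] itself: the parameters are toy values, the polynomial coefficient is fixed rather than random, $\gcd(e, L^2) = 1$ so that $H = 1$, and secure channels are not modelled. The helper names `lagrange_at_zero` and `decrypt` are hypothetical.

```python
from fractions import Fraction
from math import factorial

# Toy parameters (assumed for illustration; not secure sizes).
p, q, e, n, t = 11, 23, 7, 4, 2
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)
L = factorial(n - 1)

# Dealer: split d = P + L^2 * k (mod phi(N)); here gcd(e, L^2) = 1, so H = 1.
P = pow(e, -1, L * L)
s = (1 - e * P) // (L * L)
k = (d * s) % phi

# Polynomial with f(0) = L^2 * k and the other coefficient a multiple of L
# (fixed here instead of random so the run is reproducible).
coeffs = [L * L * k, 5 * L]

def f(x):
    return sum(c * x**i for i, c in enumerate(coeffs))

shares = {j: f(j) for j in range(1, n + 1)}

def lagrange_at_zero(j, A):
    """lambda_{j,A} = product over l in A, l != j, of l / (l - j)."""
    lam = Fraction(1)
    for l in A:
        if l != j:
            lam *= Fraction(l, l - j)
    return lam

def decrypt(C, A):
    result = pow(C, P, N)                              # the public part C^P
    for j in A:
        sigma = shares[j] * lagrange_at_zero(j, A)
        assert sigma.denominator == 1                  # integral because L divides s_j
        result = result * pow(C, int(sigma), N) % N    # multiply in the partial result Z_j
    return result

M = 42
C = pow(M, e, N)
recovered = decrypt(C, [1, 3])
```

Negative exponents in `pow` rely on the ciphertext being invertible modulo $N$, which holds for the message used here.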



We remark that the scheme described above is also applicable to RSA
signature schemes, where the client asks for the agencies' signature $M^d$ on his
selected message $M$ (RSA signing uses the same mechanism as RSA
decryption).

3 What Is the Technical Obstruction?

We would like to start distributed decryption and signature based on the
distributed generation (of course, without reconstructing the key at a centralized
dealer!).

Now we discuss the model without a dealer. The scheme in [FGMY97] cannot be
applied as is to our case, because [FGMY97] assumes a trusted dealer.

Frankel, MacKenzie and Yung [FMY98] proposed methods for distributed key
generation and $(t, n)$-sharing based on the Boneh-Franklin technique [BF97]:
first generate a key with $(n, n)$-sharing, then compute a polynomial sharing from
the sum sharing of the secret key $d$. Each shareholder $P_j$ acts as a dealer for
his shared secret $d_j$ and performs the "Sum-to-Poly" transform. After
$(t, n)$-sharing of the secret $d$, any $t$ members of the shareholders can recover
the secret key $d$.

Now again suppose that a client has a ciphertext

$$C = M^e \pmod N$$

encrypted with the public key $(e, N)$. The client wants to decrypt this
encrypted message $C$ by asking a set $A$ of $t$ shareholders. If the direct method
of [FGMY97] is applied to this case, each shareholder $P_j$ needs to compute the
coefficients of Lagrange interpolation

$$\lambda_{j,A} = \prod_{l \in A \setminus \{j\}} \frac{l}{l - j}.$$

Note, however, that no single shareholder knows the complete factorization of
$N$. Thus, shareholders cannot compute the inverse of $l - j$ modulo $\phi(N)$. Namely,
$\lambda_{j,A}$ is not obtained. Then, simply as described in [FMY98] (without some
additional measures), shareholders cannot execute the distributed computation:

$$\prod_{j \in A} C^{s_j \lambda_{j,A}} = C^d = M \pmod N.$$

The paper is dedicated to filling this gap and to showing how to actually do it.

4 Scheme A: Combining [FMY98] with [FGMY97] 

In our first scheme, we combine the two previous schemes [FGMY97] and
[FMY98]. This carefully checks that the distributed key generation can indeed
serve as a starting point for a distributed RSA service where the key
is never known to any party. This feasibility demonstrates the strength of the
"sharing representation" modification techniques. The method takes the
following steps. First, the agencies use the [FMY98] method for $(n, n)$-sharing in
the no-dealer model. Second, each agency acts as the dealer for his shared secret
by using the [FGMY97] method.




On Threshold RSA-Signing with no Dealer 201 



4.1 Key Generation and Secret Key Sharing

Step 1: Perform distributed key generation with $(n, n)$-sharing in the
no-dealer model by using the [FMY98] method. Now, each agency $\mathcal{P}_j$ keeps his
shared secret $d_j$ satisfying

$$d = d_1 + d_2 + \cdots + d_n.$$

Step 2: Each agency $\mathcal{P}_j$ acts as the dealer for his shared secret key $d_j$ by
using the technique of [FGMY97]: $\mathcal{P}_j$ computes $(k_j, P_j)$ such that
$d_j = L^2 k_j + P_j$, then chooses a polynomial

$$f_j(x) = L^2 k_j + f_{j,1} x + \cdots + f_{j,t-1} x^{t-1}.$$

$\mathcal{P}_j$ sends $s_{j,i} = f_j(i)$ to $\mathcal{P}_i$ $(1 \le i \le n)$. (In fact, a commitment
scheme and Pedersen's sharing, which are both unconditionally hiding, are
used.) For each shareholder but the last, $P_j$ is sent to the last player with a
public commitment. The last player adds all the shares it got, and this is his
new share. He computes $P$ using the extended Euclidean algorithm, and his share
is represented as $d_n = L^2 k_n + P$; he distributes it and proves that it equals
the summed shares plus his own original share minus $P$.

Step 3: Each agency $\mathcal{P}_j$ verifies the correctness of the $s_{i,j}$ received from
the $n$ agencies. If they are correct, it computes $s_j = \sum_{i=1}^{n} s_{i,j}$. Note
that $s_j$ is the value $s_j = F(j)$ of the polynomial $F(x)$

$$F(x) = L^2 K + F_1 x + \cdots + F_{t-1} x^{t-1},$$

where $K = \sum_{j=1}^{n} k_j$ and $F_i = \sum_{j=1}^{n} f_{j,i}$ $(1 \le i \le t - 1)$.
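
The core of Steps 2 and 3, turning an additive sharing into a polynomial sharing by letting every party deal its own value, can be sketched as follows. This is a simplified illustration only: the commitments, the handling of the public remainder $P$, and the prescribed coefficient ranges are omitted, and the numeric values as well as the helper names `share_polynomial` and `interpolate_at_zero` are hypothetical.

```python
import random
from fractions import Fraction

def share_polynomial(secret, t, n, rng):
    """Shares f(1..n) of a random degree-(t-1) integer polynomial with f(0) = secret."""
    coeffs = [secret] + [rng.randrange(0, 1000) for _ in range(t - 1)]
    return [sum(c * i**k for k, c in enumerate(coeffs)) for i in range(1, n + 1)]

def interpolate_at_zero(points):
    """Lagrange interpolation at x = 0 over the rationals; points maps x -> y."""
    total = Fraction(0)
    for j, y in points.items():
        lam = Fraction(1)
        for l in points:
            if l != j:
                lam *= Fraction(l, l - j)
        total += y * lam
    return total

rng = random.Random(0)
n, t = 4, 2
additive = [120, 45, 300, 77]        # hypothetical stand-ins for the values L^2 * k_j

# Each party j deals its own value; subshares[j][i] is what party j sends to party i + 1.
subshares = [share_polynomial(v, t, n, rng) for v in additive]

# Party i adds everything it received: s_i = sum over j of s_{j,i} = F(i).
poly_shares = {i + 1: sum(row[i] for row in subshares) for i in range(n)}

# Any t parties can now interpolate F(0), the sum of the dealt values.
recovered = interpolate_at_zero({i: poly_shares[i] for i in (2, 4)})
```

Because the sum of polynomials of degree $t - 1$ again has degree $t - 1$, any $t$ of the summed shares determine $F(0)$.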



4.2 Distributed Decryption/Signature 

Any $t$ of the $n$ shareholders can decrypt/sign. Let $A$ be one group of $t$
shareholders. Set $P = \sum_{i=1}^{n} P_i$.

Step 1: $\mathcal{U}$ sends the ciphertext $C$ to each shareholder $\mathcal{P}_j$.

Step 2: Each shareholder $\mathcal{P}_j$ $(j \in A)$ executes the following:

$$\sigma_j = s_j \prod_{i \in A \setminus \{j\}} \frac{i}{i - j}$$

$$\alpha_j = \text{sum-to-sum}(\sigma_i,\ i \in A)$$

$$Z_j = C^{\alpha_j} \pmod N$$

(In the above, the sum-to-sum ensures that the sum of the fresh individual
shares stays the same as the sum of the $\sigma$s, but each sub-sum is completely
random.)

Then, $\mathcal{P}_j$ sends the result $Z_j$ to $\mathcal{U}$.

Step 3: $\mathcal{U}$ recovers $M$ from the public information $P$ and the $t$ results:

$$C^P \prod_{j \in A} Z_j = C^{\sum_{j=1}^{n} (L^2 k_j + P_j)} = M \pmod N.$$



4.3 Notes 

- Actually, not $n$ but only $t + 1$ players need to generate the key. If one
misbehaves, it is eliminated and the process restarts. There are at most $t$
misbehaving parties, and at the end the $t + 1$ players, at least one of whom is
honest, distribute the shares of $d$ to the $n$ parties.

- Decryption/signing is as in [FGMY97].

- Note that, in the signing version, the ciphertext $C$ is replaced by the message
$M$.

- In Scheme-A, during key generation and secret sharing, in Step 3 each
shareholder should check the correctness of all $s_{i,j}$ received from $\mathcal{P}_i$
(for all $i$). This is for robustness. Share randomization also needs to be
supported by robustness tools (see [FGMY97]).



4.4 Security 

The initial distribution is secure, due to the distributed key generation of
[FMY98] (given an adversary we can simulate the view). Then the distribution
of the $d_i$ is robust and results in a $t$-out-of-$n$ sharing of $L^2 K$, with the
same $P$ as the public part. This is simulatable using the sum-to-poly arguments in
[FGMY97] and the public unconditionally concealing commitments. The adversary
controlling at most $t$ agents has a view which is simulatable. The signing
operation is simulatable as well, as in [FGMY97].

The operations are robust: following [FGMY97,FMY98], we can add checks and
elimination of misbehaving parties.



5 Scheme-B: Based on GCD-Decoding 

In our second scheme, the client decodes the ciphertext by using the extended
Euclidean algorithm. This trick is well known from Simmons' RSA protocol
failure [Sim83]. For the signing part, we start without robustness (assuming
honest-but-curious behavior) and then add robustness.



5.1 The Common Modulus Protocol Failure in RSA

We review the common modulus protocol failure in RSA schemes remarked by
Simmons [Sim83]. The failure is that if the same
message is ever encrypted with two different exponents under the same modulus,
and those two exponents are relatively prime, then the plaintext can be recovered
without either of the decryption exponents by using the extended Euclidean
algorithm.

Let $m$ be the plaintext message. The two different encryption keys are $e_1$
and $e_2$. The common modulus is $N$. The corresponding two ciphertexts are
$c_1 = m^{e_1} \pmod N$ and $c_2 = m^{e_2} \pmod N$. We consider a cryptanalyst who knows
$N$, $e_1$, $e_2$, $c_1$ and $c_2$. The cryptanalyst is able to recover $m$ in the
following way. Since $e_1$ and $e_2$ are relatively prime, the extended Euclidean
algorithm can find $r$ and $s$ satisfying the relation $r e_1 + s e_2 = 1$. Note that
either $r$ or $s$ has to be negative, so we assume that $r$ is negative. Then, the
cryptanalyst can use the extended Euclidean algorithm again for computing
$c_1^{-1}$. Thus, the plaintext message $m$ is recovered by the computation
$(c_1^{-1})^{-r} (c_2)^s = m \pmod N$.

Thus, each party must choose its own RSA modulus. This protocol failure is
a negative aspect of the RSA scheme; however, our new technique shows that the
trick has a positive cryptographic application.
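
The recovery described above is easy to reproduce. The following is a minimal sketch with a toy modulus and exponents chosen here purely for illustration; the function names `egcd` and `common_modulus_recover` are hypothetical helpers, not taken from [Sim83].

```python
def egcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def common_modulus_recover(c1, c2, e1, e2, N):
    """Recover m from c1 = m^e1 and c2 = m^e2 (mod N) when gcd(e1, e2) = 1."""
    g, r, s = egcd(e1, e2)
    if g != 1:
        raise ValueError("exponents must be relatively prime")
    # pow with a negative exponent inverts modulo N (Python 3.8+);
    # it requires the ciphertexts to be invertible modulo N.
    return pow(c1, r, N) * pow(c2, s, N) % N

N = 11 * 23          # toy modulus, for illustration only
e1, e2 = 3, 7
m = 42
c1, c2 = pow(m, e1, N), pow(m, e2, N)
recovered = common_modulus_recover(c1, c2, e1, e2, N)
```

No decryption exponent and no factor of $N$ is used anywhere in the recovery, which is exactly the point of the protocol failure.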



5.2 Distributed Key Generation and Secret Key Sharing 

In the no-dealer model, by using Frankel-MacKenzie-Yung's secret key sharing
scheme [FMY98], the $n$ shareholders $\mathcal{P}_j$ jointly generate an RSA public key
$(e, N)$ and share the secret key $d$ with a $(t, n)$-secret sharing which is
simulatable (over a subset of the integers), where $a_1, \ldots, a_{t-1} \in \mathbb{Z}$ and

$$f(x) = d + a_1 x + a_2 x^2 + \cdots + a_{t-1} x^{t-1}.$$

At this point they are ready to perform distributed operations. This part is
robust and secure based on [FMY98].



5.3 Distributed Decryption Protocol 

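The client $\mathcal{U}$ asks a set $A$ of $t$ shareholders to decrypt the ciphertext $C$.
Recall that $L = (n - 1)!$ and $\gcd(e, L^2) = 1$.

Step 1: $\mathcal{U}$ sends $C$ to each shareholder $\mathcal{P}_j$ $(j \in A)$.

Step 2: Each shareholder $\mathcal{P}_j$ computes:

$$\lambda_{j,A} = \prod_{l \in A \setminus \{j\}} \frac{l}{l - j}$$

$$\alpha_j = \text{sum-to-sum}(\sigma_i,\ i \in A)$$

$$Z_j = C^{\alpha_j} \pmod N$$

where $\sigma_j = L^2 s_j \lambda_{j,A}$ for the share $s_j = f(j)$, so that the exponents sum
to $L^2 d$; then $\mathcal{P}_j$ sends the partial result $Z_j$ to the combiner $\mathcal{U}$.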




204 Shingo Miyazaki, Kouichi Sakurai, and Moti Yung 



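Step 3: $\mathcal{U}$ computes, based on the actual $t$ shareholders, $C_1 \pmod N$:

$$C_1 = \prod_{j \in A} Z_j = C^{L^2 d} = M^{L^2} \pmod N$$

Step 4: $\mathcal{U}$ decrypts $M$ from the pair of different ciphertexts
$(C_1, C_2) = (M^{L^2}, M^e)$, where $C_2 = C$:

[4a]: $a_1 = (L^2)^{-1} \pmod e$

[4b]: $a_2 = (a_1 L^2 - 1)/e$

[4c]: $M = C_1^{a_1} (C_2^{a_2})^{-1} \pmod N$

Note that this procedure is sound because

$$C_1^{a_1} (C_2^{a_2})^{-1} = M^{a_1 L^2 - e a_2} = M \pmod N.$$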
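
The GCD-based decoding can be exercised on small numbers. The sketch below is an illustration under stated assumptions rather than the full protocol: the parameters are toy values, the sharing of $d$ is dealt centrally here only to obtain test shares (the scheme itself generates them without a dealer), the sum-to-sum re-randomization is left out, and the names `partial_result` and `combine` are hypothetical.

```python
from fractions import Fraction
from math import factorial

# Toy parameters (assumed for illustration; not secure sizes).
p, q, e, n, t = 11, 23, 7, 4, 2
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)
L2 = factorial(n - 1) ** 2            # L^2, with gcd(e, L^2) = 1

# (t, n)-sharing of d over the integers: f(x) = d + a_1 x (a_1 fixed for reproducibility).
coeffs = [d, 5]
shares = {j: sum(c * j**i for i, c in enumerate(coeffs)) for j in range(1, n + 1)}

def partial_result(C, j, A):
    """Z_j = C^(L^2 * s_j * lambda_{j,A}) mod N, without sum-to-sum re-randomization."""
    lam = Fraction(1)
    for l in A:
        if l != j:
            lam *= Fraction(l, l - j)
    exponent = L2 * shares[j] * lam
    assert exponent.denominator == 1   # L^2 clears the Lagrange denominators
    return pow(C, int(exponent), N)

def combine(C, partials):
    C1 = 1
    for Z in partials:
        C1 = C1 * Z % N                # C1 = C^(L^2 d) = M^(L^2)
    a1 = pow(L2, -1, e)                # [4a]
    a2 = (a1 * L2 - 1) // e            # [4b]
    return pow(C1, a1, N) * pow(C, -a2, N) % N   # [4c], with C2 = C

M = 42
C = pow(M, e, N)
A = [2, 4]
recovered = combine(C, [partial_result(C, j, A) for j in A])
```

Note that the combiner never inverts anything modulo $\phi(N)$; the only inverse it needs is modulo the public exponent $e$.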



5.4 Security 

The initial distribution of $d$ using Pedersen's sharing (in a large enough subset
of the integers) based on sum-to-poly is secure and simulatable, as was shown in
[GMY98] (so that $t - 1$ shares can be picked at random). For arguing the security
of signing: given the outcome $M$ (which the simulator has), $M^{L^2} = C_1$ can be
computed by the simulator, and given $t - 1$ partial results $Z_j$ (which are
$(t-1)$-wise independent) we can use the equation in Step 3 to compute the missing
$Z_t$ (say), simply by dividing the result $C_1$ by the partial results modulo $N$.

5.5 A Version for Signing 

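Scheme-B can be modified for threshold signing. Instead of the ciphertext $C$, we
give a message $M$ to $t$ agencies. Then, $\mathcal{U}$ gets the following two data:

$$M = (M^d)^e \pmod N, \qquad \prod_{j \in A} Z_j = (M^d)^{L^2} \pmod N.$$

$\mathcal{U}$ computes $(\alpha, \beta)$, by using the extended Euclidean algorithm, satisfying

$$e\alpha + L^2\beta = 1.$$

$\mathcal{U}$ can get the signature with $(\alpha, \beta)$:

$$M^d = M^{\alpha} \Bigl(\prod_{j \in A} Z_j\Bigr)^{\beta} \pmod N.$$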







5.6 Robust- Version 

Of course, key distribution and re-distribution are robust as in [FMY98], 
[FGMY97] . The client adds the following protocol to Step 2 for checking whether 
each shareholder Vj computes the partial result Zj with its share Sj correctly. 
Let 5 be a generator of Now, each party Vj publishes hj = (mod N) 
and the client has the partial result Zj from Vj . The client can verify the validity 
of each partial result by applying Pedersen’s technique [Ped91a]. 

Step 1: U chooses rji,rj2 G Z randomly and then computes 6 j = L'^Xj^A and 
Xj = (mod N). U sends Xj to each party Vj. 

Step 2 : Vj generates a random number aj G Z and computes Fji = X^^ ( mod 
N) and Yj2 = (mod N). Vj transfers (Y,i, Yj2) to U. 

Step 3: U sends {rj\,rj2) to Vj. 

Step 4: Vj check that Xj = (mod N) and transfers aj to U if and 

only if the validity of Xj is accepted. 

Step 5: U can verify the validity of each partial result Zj via the following 
formula: 

^ (mod N). 

6 Comparison: Scheme-A vs. Scheme-B 

We discuss the comparison between Scheme-A and Scheme-B. 

6.1 Extended Euclidean Algorithm

In Scheme-B, the ciphertext is decoded by the client at the last stage by using
the extended Euclidean algorithm, with the same mechanism as the protocol failure
in RSA [Sim83]. We should note that the extended Euclidean algorithm also plays
an important role in Scheme-A (and in [FGMY97]), in the key distribution
stage. So, an apparent difference between these schemes is the stage at which the
extended Euclidean algorithm is executed (see the comparison table). Scheme-A
requires maintenance of long-term keys as multiples of $L^2$, and Scheme-B gets
rid of this requirement.





                    [FGMY97]                     Scheme A                     Scheme B
Model               Dealer                       Dealer/Non-Dealer            Dealer/Non-Dealer
Euclidean formula   $eP + \frac{L^2}{H} s = 1$   same as FGMY97 (public P)    $a_1 L^2 - e a_2 = 1$
When is EA done?    At key distribution          At key distribution          Every Dec./Sig.

EA: the (extended) Euclidean Algorithm



6.2 Computation by Multi-parties vs by Single-party 

With respect to the key generation/distribution stages, Scheme-B is somewhat
simpler than Scheme-A. Scheme-A requires much distributed computation in its
set-up (due to divisibility conditions). Given that a system is initiated
distributedly, where $d$ is shared in the right domain, Scheme-B is simpler. Only
in cases (e.g. palm-top devices [BD99]) where we want to perform combining while
minimizing multiplication and exponentiation may Scheme-A be useful, because
GCD combining is somewhat more costly than simply combining the partial
results in Scheme-A, where only one exponentiation and $t - 1$ multiplications are
performed.
Abstract. This paper proposes a threshold KCDSA (Korean Certification-
based Digital Signature Algorithm) signature scheme. For this goal we present
some new secret sharing schemes and a robust multiplication protocol based on
a non-interactive ZK proof. The secret sharing scheme using a hash function
can verify the validity of the received data easily, so we can reduce the
complexity of the protocol. We also present a robust multiplication protocol for
two shared secrets based on a non-interactive ZK proof scheme. Players check
the validity of the broadcast shares by using a non-interactive ZK proof,
accept only the correct shares, and then interpolate the polynomial by using the
accepted shares. Finally we propose a threshold KCDSA signature, which is
composed of a key-sharing protocol and a signature generation protocol. We prove
that the proposed KCDSA signature is a $(t, n)$-robust threshold protocol, which
tolerates up to $t$ eavesdropping and halting faults if the number of players is
$n \ge 2t + 1$.



1 Introduction 

Highly advanced cryptography and network security techniques have been
developed during the past twenty years. In particular, a number of cryptographic
techniques have been applied to the multiparty computation problem for the secrecy
and integrity of information. The combined techniques have become extremely
powerful tools for cryptographic applications, but some of these results lack
practical feasibility.

Shamir first introduced the concept of a secret sharing scheme in [2]. In Shamir's
scheme a misbehaving dealer can send inconsistent shares to the participants,
from which they will not be able to reconstruct a correct secret collaboratively. To
prevent such malicious behavior by the dealer, the verifiable secret sharing
schemes of Feldman and Pedersen were proposed in [3,4,5].

To support a long lifetime of the shared secret, a proactive secret sharing scheme
is presented in [6]. This technique refreshes the shares to improve the security of
the shared secret during a time period. Namely, what is actually required to protect
the secrecy of the information is to be able to periodically update the shares without
changing the original content of the secret. A proactive secret sharing scheme
consists of three protocols, i.e., the private key renewal protocol, the share
recovery protocol, and the share renewal protocol. Also, the on-line secret sharing
scheme can dynamically add more participants on line, without having to
redistribute new shares secretly to the current participants [7,8,9].

These secret sharing schemes have been developed for various applications. The
threshold signature system for DSS, the Digital Signature Standard, was studied in
[1,10], among others. For other public key cryptosystems, results were published
in [11,12,13] for the case of RSA signatures and in [14] for ElGamal-type
signatures.

In this paper, we present a new verifiable secret sharing scheme using a hash
function and a robust multiplication scheme for two secrets. The presented secret
sharing scheme allows the validity of the received values to be verified easily, but
it cannot check whether the received values are derived from some polynomial or not.
However, in the reconstruction phase we can check whether the shares are correct or
not. If the players find a problem, then a group of players performs a sharing phase
and a reconstruction phase to detect the misbehaving players by using Pedersen's VSS
or Feldman's VSS. We also present a robust protocol for multiplying two shared
secrets. This protocol uses a non-interactive ZK proof.

We describe the existing secret sharing schemes in section 2. Some basic
protocols, such as a verifiable secret sharing scheme using a hash function and a
robust multiplication protocol based on a non-interactive ZK proof, are presented in
section 3. The basic operation of KCDSA is presented in section 4. In section 5 we
show how the presented schemes are combined jointly and securely for KCDSA
private/public key generation, and we present a robust threshold KCDSA signature
scheme. Finally, section 6 summarizes the results.



2 Existing Secret Sharing Schemes 

In this section we review a few known secret sharing techniques.



2.1 Shamir’s Secret Sharing 

This scheme has a trusted dealer who has the authority to distribute a share to each
player. If the dealer wants to distribute a secret a, then he randomly chooses a
polynomial f(x) of degree t such that f(0) = a. The dealer sends the computed shares
$p_i = f(i)$ to the players, for $i = 1, \ldots, n$. A coalition of t+1 (or more) players can
interpolate a polynomial to evaluate secret a. In general, this secret sharing scheme is 
denoted by > «. 
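
As a concrete illustration of sharing and reconstruction, the following minimal sketch works over a prime field. The modulus, the secret, and the helper names `share` and `reconstruct` are assumptions made for this example; note that a polynomial of degree t is determined by t+1 points, so t+1 shares are used for reconstruction.

```python
import random

Q = 2**61 - 1   # a prime modulus chosen for illustration (assumption, not from the text)

def share(secret, t, n, rng):
    """Return shares {i: f(i) mod Q} of a random degree-t polynomial with f(0) = secret."""
    coeffs = [secret % Q] + [rng.randrange(Q) for _ in range(t)]
    return {i: sum(c * pow(i, k, Q) for k, c in enumerate(coeffs)) % Q
            for i in range(1, n + 1)}

def reconstruct(points):
    """Lagrange interpolation at x = 0 modulo Q from a dict {i: share}."""
    secret = 0
    for j, y in points.items():
        lam = 1
        for l in points:
            if l != j:
                lam = lam * l % Q * pow(l - j, -1, Q) % Q
        secret = (secret + y * lam) % Q
    return secret

rng = random.Random(1)
t, n = 2, 5
shares = share(123456789, t, n, rng)
subset = {i: shares[i] for i in (1, 3, 5)}    # t + 1 = 3 shares
recovered = reconstruct(subset)
```

Any other choice of t+1 distinct players gives the same value, since the interpolating polynomial is unique.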



2.2 Verifiable Secret Sharing 

In this scheme, each player wants to verify the validity of the information received
from other players. Many VSS schemes have been proposed so far. These schemes
commit to all coefficients of the polynomials, and then each
player verifies the information received from the dealer (or other players) using the
committed information. A VSS scheme is based either on computational secrecy or on
information-theoretic secrecy.



2.3 Joint Random Secret Sharing 

In a joint random secret sharing scheme all players act as dealers, i.e. all players
collectively choose shares corresponding to a (t, n) threshold secret sharing of a
random value. Each player obtains his share by adding the received partial
information, and then a coalition of t+1 (or more) players can interpolate to evaluate
the secret. Each player may verify the validity of the received information by using VSS.



2.4 Multiplication of Two Secrets 

Given two secrets $\alpha$ and $\beta$ that are both shared among the players, the task is
to compute the product $\alpha\beta$ while keeping both original values secret. In a
$(t, n)$ threshold secret sharing scheme, given that $\alpha$ and $\beta$ are each shared by a
polynomial of degree $t$, each player can locally multiply his shares of $\alpha$ and
$\beta$, and the result will be a share of $\alpha\beta$ on a polynomial of degree $2t$. Namely,
a coalition of $2t+1$ (or more) players can interpolate the polynomial to evaluate
the secret $\alpha\beta$; furthermore, this scheme needs a joint zero secret sharing for
the re-randomizing procedure.

To overcome the above shortcoming, a new efficient protocol that reduces the degree
of the polynomial and does not have to use re-randomizing polynomials, in a single
step, has been proposed [1]. However, this scheme needs three joint random secret
sharings and the inversion of a $(2t+1) \times (2t+1)$ matrix. $2t+1$ players must
cooperate to generate the secret.
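
The degree doubling can be observed directly. The sketch below is a plain illustration of local share multiplication, assuming a prime field and values chosen here for the example; it leaves out the re-randomization and the degree reduction discussed above, and the helper names are hypothetical.

```python
import random

Q = 2**61 - 1   # illustrative prime modulus (assumption)

def share(secret, degree, n, rng):
    """Shares {i: f(i) mod Q} of a random polynomial of the given degree with f(0) = secret."""
    coeffs = [secret % Q] + [rng.randrange(Q) for _ in range(degree)]
    return {i: sum(c * pow(i, k, Q) for k, c in enumerate(coeffs)) % Q
            for i in range(1, n + 1)}

def reconstruct(points):
    """Lagrange interpolation at x = 0 modulo Q."""
    secret = 0
    for j, y in points.items():
        lam = 1
        for l in points:
            if l != j:
                lam = lam * l % Q * pow(l - j, -1, Q) % Q
        secret = (secret + y * lam) % Q
    return secret

rng = random.Random(2)
t, n = 2, 5                       # n >= 2t + 1
alpha, beta = 1234, 5678
sa = share(alpha, t, n, rng)
sb = share(beta, t, n, rng)

# Each player multiplies its two shares locally; the products lie on a degree-2t polynomial.
prod = {i: sa[i] * sb[i] % Q for i in sa}

product_all = reconstruct(prod)                               # uses 2t + 1 = 5 points
product_few = reconstruct({i: prod[i] for i in (1, 2, 3)})    # only t + 1 = 3 points
```

With all $2t+1$ points the interpolation yields $\alpha\beta$, whereas `product_few`, computed from only $t+1$ points, will in general not match it.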



2.5 Exponential Interpolation Scheme 

Given that every player has a share of the secret information $a$, the exponential
interpolation scheme interpolates $g^a$ instead of $a$. Each player distributes his
share using Shamir's secret sharing scheme. A coalition of $t+1$ (or more) players
can interpolate to evaluate $g^a$ by using Lagrange interpolation coefficients.
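
Interpolation in the exponent can be sketched as follows. The group ($g = 2$ of prime order $q = 11$ modulo $p = 23$), the secret, and the helper names are assumptions chosen for this example only; the players publish $g^{s_i}$, never the shares $s_i$ themselves.

```python
import random

# Toy group (assumption for illustration): g = 2 has prime order q = 11 modulo p = 23.
p, q, g = 23, 11, 2

def share(secret, t, n, rng):
    """Shamir shares modulo q of a degree-t polynomial with f(0) = secret."""
    coeffs = [secret % q] + [rng.randrange(q) for _ in range(t)]
    return {i: sum(c * pow(i, k, q) for k, c in enumerate(coeffs)) % q
            for i in range(1, n + 1)}

def interpolate_in_exponent(public):
    """Combine values g^{s_i} into g^{f(0)} using Lagrange coefficients modulo q."""
    result = 1
    for j, y in public.items():
        lam = 1
        for l in public:
            if l != j:
                lam = lam * l % q * pow(l - j, -1, q) % q
        result = result * pow(y, lam, p) % p
    return result

rng = random.Random(3)
t, n, a = 2, 5, 7
shares = share(a, t, n, rng)
public = {i: pow(g, shares[i], p) for i in (1, 2, 4)}   # t + 1 players publish g^{s_i}
g_to_a = interpolate_in_exponent(public)
```

The Lagrange coefficients are reduced modulo the group order $q$, which is why $q$ must be known and the player indices must be distinct modulo $q$.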



2.6 Proactive Secret Sharing Scheme 

In a $(t, n)$-threshold scheme, an adversary needs to compromise more than $t$ players
in order to learn the secret, and to corrupt at least $n - t$ shares in order to destroy
the information. If the lifetime of the secret is long, then an adversary may attack
over a long period of time. Therefore, to protect long-lived secrets we need to
refresh the shares periodically. A proactive secret sharing scheme satisfying these
requirements was presented in [6]. A proactive secret sharing scheme consists of
three protocols as follows:

1. The private key renewal protocol

2. The share recovery protocol (including lost share detection)

3. The share renewal protocol

One can keep the KCDSA signature key securely for a long time by using the 
proactive secret sharing scheme while its shares can be refreshed periodically. An 
adversary trying to break the threshold signature scheme needs then to corrupt t 
servers in one single period of time, as opposed to having the whole lifetime of the 
key to do so. 



3 Basic Building Blocks 

In this section we present some basic VSS protocols. These protocols can be applied 
to the threshold KCDSA signature or other threshold applications. 



3.1 A VSS Using Hash Function 

In general, a VSS based on Feldman's or Pedersen's secret sharing scheme allows the 
players to check the validity of the shares received from the other players, but such a VSS 
has the shortcoming that it requires a large amount of computation. Feldman's scheme 
protects the secret only computationally; on the other 
hand, Pedersen's scheme guarantees information-theoretic secrecy for the shared secret. 

For our protocol we need the following assumptions on a hash function: 

1. H(·) denotes a commitment function. We realize the commitment function 
by a hash function. 

2. It is infeasible to find two distinct strings x and y such that H(x) = H(y). This property 
is known as the collision resistance of the hash function. 

3. H(·) should be easy to compute. 

4. H(·) should be one-way. That is, given y = H(x) it is computationally infeasible to find x, 
although y = H(x) can be computed easily from x. 

We assume that H(·) is a one-way hash function satisfying the above conditions. The 
security of H(·) can, however, only be conjectured on the basis of the collision 
resistance of the hash function. Namely, we assume that the one-way hash 
function H(·) is secure. 

In this subsection we present a VSS using a hash function to reduce the computational 
complexity. This protocol can verify the correctness of received shares by using the 
hash function, but it cannot prove that the committed values are obtained from a 
polynomial f(x) of degree at most t such that f(0) = s. In Figure 1, H denotes a general, 
arbitrary hash function. 
1. Sharing Phase 

   Choose two random polynomials of degree t. 

   $f(x) = a_t x^t + \cdots + a_1 x + a_0$, where $a_t = H(a_{t-1}, \ldots, a_1, a_0)$ 
   $g(x) = b_t x^t + \cdots + b_1 x + b_0$, where $b_t = H(b_{t-1}, \ldots, b_1, b_0)$ 

   Compute the following values. 
   $\alpha_i = f(i)$, $\varepsilon_i = g(i)$ 
   Compute $E_j = H(\alpha_j, \varepsilon_j)$, $j = 1, \ldots, n$. 
   Compute $A_k = g^{a_k} h^{b_k} \bmod p$, for $k = 0, \ldots, t$. 
   Broadcast $E_j$ and $A_k$. 

2. Reconstruction Phase 

   Collect t+1 (or more) shares. 
   Interpolate $f(x)$ and $g(x)$. 
   Compute $\alpha_i = f(i)$ and $\varepsilon_i = g(i)$. 
   Check the following equations. 

   $a_t \stackrel{?}{=} H(a_{t-1}, \ldots, a_1, a_0)$ 
   $E_j \stackrel{?}{=} H(\alpha_j, \varepsilon_j)$, $j = 1, \ldots, n$ 

   If this test passes, then the recovered secret is regarded as 
   $f(0)$; otherwise, check the following equation to identify who 
   is a false player using the published information. 

   $g^{\alpha_i} h^{\varepsilon_i} = \prod_{k=0}^{t} A_k^{\,i^k}$, $i = 1, \ldots, n$ 



Fig. 1. Our VSS using Hash Function 

In the following we will refer to this protocol as VSS-H. The detailed protocol is 
described in Figure 1. 



3.2 Joint Unconditionally Secure RSS Based on Pedersen’s VSS 

In this section we present a modified joint unconditionally secure random secret 
sharing scheme based on Pedersen's VSS. In this scheme we commit to the values of the 
polynomial instead of its coefficients, and we use a hash function to 
commit to these values. A VSS using a hash function can easily verify the 
correctness of received shares, but it cannot prove that the committed values are 
obtained from the specified polynomial of degree at most t. Players also commit to the 
coefficients of the polynomial, because the participants who take part in the 
reconstruction phase want to verify the shares broadcast by t+1 (or more) players. 
1. At the sharing phase, each player $P_i$: 

   - Chooses two random polynomials as follows. 
     $f_i(x) = a_{i,t} x^t + \cdots + a_{i,1} x + a_{i,0}$ 
     $g_i(x) = b_{i,t} x^t + \cdots + b_{i,1} x + b_{i,0}$ 

   - Computes the values $\alpha_{i,j}$ and $\rho_{i,j}$ as follows. 
     $\alpha_{i,j} = f_i(j)$, $\rho_{i,j} = g_i(j)$ 

   - Computes the following two values and broadcasts them. 
     $E_{i,j} = H(\alpha_{i,j}, \rho_{i,j})$, $A_{i,j} = g^{[\ldots]}$   $(1 \le i \le n,\ 1 \le j \le n)$ 

   - Sends $\alpha_{i,j}$ and $\rho_{i,j}$ to the other players and at the same time receives $\alpha_{j,i}$ 
     and $\rho_{j,i}$ from the other players $(1 \le i \le n,\ 1 \le j \le n)$. 

   - Checks the correctness of the received values by using the verification 
     equation $E_{j,i} \stackrel{?}{=} H(\alpha_{j,i}, \rho_{j,i})$. If this test passes, he proceeds to the 
     next step; otherwise he checks the following equation. If that test does not pass, the 
     player opens his share and stops the protocol. 
     [verification equation illegible in the source; only the lower limit $k = 0$ survives] 

   - Computes his own shares. 
     $s_i = \sum_{j=1}^{n} \alpha_{j,i} \stackrel{\mathrm{def}}{=} s(i)$, $e_i = \sum_{j=1}^{n} \rho_{j,i} \stackrel{\mathrm{def}}{=} e(i)$ 

2. At the reconstruction phase: 

   - Each player broadcasts the values $s_i$, $e_i$. 

   - Each player checks the received $s_i$, $e_i$ by using the following equation. 
     [verification equation illegible in the source] 

   - Collect the (t+1) shares that passed the above test. 

   - Interpolate $s(x)$, $e(x)$ from the values $s(i)$, $e(i)$, and compute $f(i)$, $g(i)$. 

   - Check the following equation again. 
     [verification equation illegible in the source] 

   - Output $s(0)$ as the secret information. 
Fig. 2. Joint-Uncond-Secure-RSS based on Pedersen’s VSS 

The presented scheme checks the validity of the received data using the hash function 
in the sharing phase, and then verifies the validity of the broadcast shares using 
Pedersen's VSS in the reconstruction phase. Namely, if no faults occur, the players do not 
have to compute any exponentiation in the sharing phase. In the following we 
will refer to this protocol as Pedersen-VSSH. The detailed protocol is described in 
Figure 2. 



3.3 ZK Proofs for Multiplication of Committed Shares 

A robust multiplication protocol based on an interactive ZKIP was presented in [1]. In 
this section we describe that idea again and extend the scheme to a non-interactive 
robust multiplication scheme. In the protocol, a player proves 
to a verifier that he knows the discrete logarithm values and that the committed value is 
constructed properly, as shown in Figure 3. This scheme can be used in a 
non-interactive robust multiplication protocol when the number of players involved in 
the protocol is 2t+1. 

3.3.1 Interactive ZK Proof 

The prover wants to prove to the verifier that he knows how to open the 
commitments A, B, and C, and that the opening of C that he knows is really the product of the values 
he committed to in A and B. First, the prover publishes A, B, and C, whose values 
are $g^{\alpha} h^{\rho}$, $g^{\beta} h^{\sigma}$, and $g^{\alpha\beta} h^{\tau}$, respectively, and then he proves to the verifier that he 
knows the representation of $A = g^{\alpha} h^{\rho}$. To verify that the prover knows 
the representation of $A = g^{\alpha} h^{\rho}$, which is known to the verifier, the two players 
perform the following protocol: 

1. The prover chooses d and s at random. He sends to the verifier the message 
   $M = g^{d} h^{s}$, for $d, s \in_R Z_q$. 

2. The verifier chooses a random number $e \in_R Z_q$ and sends it to the 
   prover. 

3. The prover computes the values y and w as follows: 
   $y = d + e\alpha$, $w = s + e\rho$ 

4. The verifier checks the following equation: 
   $g^{y} h^{w} = M A^{e}$ 

The prover can also prove to the verifier that he knows the representation 
of $B = g^{\beta} h^{\sigma}$ in the same way. The prover then proves that he 
knows the representation of $C = g^{\alpha\beta} h^{\tau}$ and that its exponent is the 
product of the two committed secrets. Figure 3 describes the detailed 
protocol. 
Fig. 3. Interactive ZK proof that the exponent is the product of two known values 
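
The four-step proof of knowledge of an opening of $A$ can be sketched as follows. This is a minimal illustration under assumed toy parameters (`P`, `Q`, `G`, `H`) and assumed function names; in a real deployment nobody may know the discrete logarithm of `H` to the base `G`, which this toy group does not guarantee.

```python
import random

P, Q = 2039, 1019   # assumed toy primes with Q dividing P - 1
G, H = 4, 9         # two elements of order Q (illustrative only)


def commit(m, r):
    """Commitment g^m * h^r mod p."""
    return pow(G, m, P) * pow(H, r, P) % P


def prover_commit(rng):
    """Step 1: choose d, s at random and send M = g^d h^s."""
    d, s = rng.randrange(Q), rng.randrange(Q)
    return (d, s), commit(d, s)


def prover_respond(d, s, alpha, rho, e):
    """Step 3: y = d + e*alpha, w = s + e*rho (mod q)."""
    return (d + e * alpha) % Q, (s + e * rho) % Q


def verifier_check(A, M, e, y, w):
    """Step 4: accept iff g^y h^w == M * A^e mod p."""
    return commit(y, w) == M * pow(A, e, P) % P
```

A cheating prover who shifts `y` by one changes the left-hand side by a factor `G`, so the check fails.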



3.3.2 Non-interactive ZK Proof 



We present a non-interactive ZK proof as shown in Figure 4. Figure 4 is based on 
computational secrecy, and Figure 5 is based on information-theoretic secrecy. In the 
following we will refer to the protocol in Figure 5 as Non-interactive-ZK. The detailed 
protocols are described in Figure 4 and Figure 5. 



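prover                                                    verifier 

Choose $d, x$ at random 

compute 
  $M = g^{d}$ 
  $M_1 = g^{x}$ 
  $M_2 = B^{d}$ 

compute 
  $e = h(g \| M \| M_1 \| M_2)$ 

compute 
  $y = d + e\alpha$ 
  $z = x + e\beta$ 

        ----  $M, M_1, M_2, e, y, w, z, w_1, w_2$  ----> 

                                                          verify 
                                                          $g^{y} = g^{d + e\alpha} = M A^{e}$ 
                                                          [second verification equation illegible in the source] 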

Fig. 4. Non-interactive ZK Proof based on computational secrecy 
prover                                                    verifier 

Choose $d, s, x, s_1, s_2$ 
at random 

compute 
  $M = g^{d} h^{s}$ 
  $M_1 = g^{x} h^{s_1}$ 
  $M_2 = B^{x} h^{s_2}$ 

compute 
  $e = H(g \| h \| M \| M_1 \| M_2)$ 

compute 
  $y = d + e\beta$ 
  $w = s + e\sigma$ 
  $z = x + e\alpha$ 
  $w_1 = s_1 + e\rho$ 
  $w_2 = s_2 + e(\tau - \sigma\alpha)$ 

        ----  $M, M_1, M_2, e, y, w, z, w_1, w_2$  ----> 

                                                          verify 
                                                          $g^{y} h^{w} = M B^{e}$ 
                                                          $g^{z} h^{w_1} = M_1 A^{e}$ 
                                                          $B^{z} h^{w_2} = M_2 C^{e}$ 



Fig. 5. Non-interactive ZK proof based on information theoretic secrecy 
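
A minimal Python sketch of a non-interactive product proof of this kind is given below, with commitments $A = g^{\alpha} h^{\rho}$, $B = g^{\beta} h^{\sigma}$, $C = g^{\alpha\beta} h^{\tau}$. The toy group, the use of SHA-256 for the challenge, and all function names are assumptions for illustration; the challenge hashes only $g, h, M, M_1, M_2$, following the figure.

```python
import hashlib
import random

P, Q = 2039, 1019   # assumed toy primes with Q dividing P - 1
G, H = 4, 9         # two elements of order Q (illustrative only)


def commit(m, r):
    return pow(G, m, P) * pow(H, r, P) % P


def challenge(*values):
    """Fiat-Shamir style challenge e derived from the first messages."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(alpha, rho, beta, sigma, tau, rng):
    A, B = commit(alpha, rho), commit(beta, sigma)
    C = commit(alpha * beta % Q, tau)
    d, s, x, s1, s2 = (rng.randrange(Q) for _ in range(5))
    M, M1 = commit(d, s), commit(x, s1)
    M2 = pow(B, x, P) * pow(H, s2, P) % P
    e = challenge(G, H, M, M1, M2)
    y, w = (d + e * beta) % Q, (s + e * sigma) % Q
    z, w1 = (x + e * alpha) % Q, (s1 + e * rho) % Q
    w2 = (s2 + e * (tau - sigma * alpha)) % Q
    return (A, B, C), (M, M1, M2, y, w, z, w1, w2)


def verify(statement, proof):
    A, B, C = statement
    M, M1, M2, y, w, z, w1, w2 = proof
    e = challenge(G, H, M, M1, M2)
    return (commit(y, w) == M * pow(B, e, P) % P
            and commit(z, w1) == M1 * pow(A, e, P) % P
            and pow(B, z, P) * pow(H, w2, P) % P == M2 * pow(C, e, P) % P)
```

The third check works because $B^{z} h^{w_2} = B^{x} h^{s_2} \cdot (g^{\alpha\beta} h^{\tau})^{e}$ once $w_2$ absorbs the term $-e\sigma\alpha$.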



3.4 Robust Multiplication Protocol Based on Pedersen’s VSS 

In this section we show how to carry out the robust multiplication protocol based on 
Pedersen-VSSH and Pedersen-ZK. In the following we will refer to this protocol as 
Mult-ZK. Figure 6 shows this protocol. 



1. Sharing Phase 

   - Distribute $\alpha$ and $\beta$ using Pedersen-VSSH. 

   - Multiply $\alpha_i$ by $\beta_i$, and distribute $\alpha_i \beta_i$. 

   - Broadcast $C = g^{\alpha_i \beta_i} h^{\tau_i}$. 

2. Reconstruction Phase 

   - Perform the robust multiplication protocol, Non-interactive-ZK, where 
     $A = g^{\alpha_i} h^{\rho_i}$ and $B = g^{\beta_i} h^{\sigma_i}$ are the information published at the sharing 
     phase, and the value $C = g^{\alpha_i \beta_i} h^{\tau_i}$ is broadcast. 

   - Collect the correct (2t+1) shares. 

   - Interpolate $\alpha\beta$. 



Fig. 6. Multiplication protocol of two secrets based on Pedersen-VSSH 
4 KCDSA Signature 



KCDSA (Korean Certificate-based Digital Signature Algorithm) is the Korean standard 
digital signature algorithm [16]. KCDSA is composed of the public 
information p, q, g, a public key y, and a secret key x, where: 

1. p: a large prime whose length is 512 + 256i bits, for i ∈ {0, ..., 6}. 

2. q: a prime factor of p-1 whose length is 128 + 32j bits, for j ∈ {0, ..., 4}. 

3. g: a base element of order q (mod p). 

4. x: the signer's private signature key, such that $x \in_R Z_q$. 

5. y: the signer's public verification key, such that $y = g^{x^{-1}} \bmod p$. 

6. H: a collision-resistant hash function whose output length is |q|. 

7. Z: the hash value of the signer's certification data. 

The detailed protocol is shown in Figure 7. The signature on the 
message M is the pair (c, s), where Z is the hash value of the signer's certification data. The signature generation 
procedure of KCDSA is described in the left part of Figure 7, and the verification of the 
signature is described in the right part of Figure 7. 



signer                                                    verifier 

$k \in_R Z_q$ 
$w = g^{k} \bmod p$ 
$c = H(w)$ 
$h = H(Z, M)$ 
$E = h \oplus c$ 
$s = x(k - E) \bmod q$ 

        ----  $(c, s),\ M$  ----> 

                                                          $h = H(Z, M)$ 
                                                          $E = h \oplus c$ 
                                                          $w' = y^{s} g^{E} \bmod p$ 
                                                          $H(w') \stackrel{?}{=} c$ 



Fig. 7. KCDSA Signature Generation and Verification 
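
The signing and verification flow of Figure 7 can be sketched in Python as follows. This is a toy illustration, not an implementation of the KCDSA standard: the group parameters, the use of SHA-256, the reduction of $E$ modulo $q$, and the function names are all assumptions.

```python
import hashlib
import random

P, Q, G = 2039, 1019, 4  # assumed toy parameters: G has order Q modulo P


def H(*parts):
    """Hash arbitrary values to an integer (SHA-256, untruncated)."""
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def keygen(rng):
    x = rng.randrange(1, Q)           # private key
    y = pow(G, pow(x, -1, Q), P)      # public key y = g^(x^-1) mod p
    return x, y


def sign(x, z, msg, rng):
    k = rng.randrange(1, Q)
    c = H(pow(G, k, P))               # c = H(w), w = g^k mod p
    e = (H(z, msg) ^ c) % Q           # E = h XOR c, reduced mod q
    s = x * (k - e) % Q
    return c, s


def verify(y, z, msg, signature):
    c, s = signature
    e = (H(z, msg) ^ c) % Q
    w = pow(y, s, P) * pow(G, e, P) % P   # equals g^k for a valid signature
    return H(w) == c
```

Verification succeeds because $y^{s} g^{E} = g^{x^{-1} x (k - E)} g^{E} = g^{k}$.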



5 Robust Threshold KCDSA Protocol 

In this section we present a robust threshold KCDSA protocol for generating a 
distributed KCDSA signature. The robust threshold KCDSA protocol consists of two 
phases, one is a key distribution phase, and the other is a signature distribution phase. 



5.1 Distributed Key-Pair Generation Phase 

In this phase the players share the private key x and compute the public key $y = g^{x^{-1}} \bmod p$. 
The detailed protocol is described in Figure 8. 
1. Distribute the private key x 

   The players generate their shares of a secret x, which is 
   uniformly distributed in $Z_q$, with a polynomial of degree t, by using 
   Pedersen-VSSH. 

   $(x_1, \ldots, x_n) \leftrightarrow x \bmod q$ 

2. Compute and broadcast the public key $y = g^{x^{-1}} \bmod p$ 

   (a) The players generate a random value a, uniformly distributed in $Z_q$, 
       with a polynomial of degree t, using Pedersen-VSSH. 

   (b) The players generate random polynomials of degree 2t with constant 
       term 0. Denote the shares by $\{b_i\}$, $i \in \{1, \ldots, n\}$. 

   (c) Player $P_i$ broadcasts $v_i = a_i x_i + b_i \bmod q$ using Mult-ZK. 

   (d) Each player $P_i$ computes: 

       - Interpolate $v = ax \bmod q$ 
       - Compute $v^{-1} = (ax)^{-1} \bmod q$ 
       - Compute $(g^{a})^{v^{-1}} = g^{a(ax)^{-1}} = g^{x^{-1}} \bmod p$ 

Fig. 8. A Distributed Key-pair Generation Phase of KCDSA 



1. Generate k 

   The players generate a secret value k, uniformly distributed in $Z_q$, by 
   running Pedersen-VSSH with a polynomial of degree t. 
   $(k_1, \ldots, k_n) \leftrightarrow k \bmod q$ 

2. Perform Joint Zero Secret Sharing 

   The players generate random polynomials of degree 2t with constant 
   term 0. Denote the shares created in these protocols as 
   $\{c_i\}$, $i \in \{1, \ldots, n\}$. 

3. Compute the values 

   - Interpolate $w = g^{k} \bmod p$ 
   - $r = H(w)$, $h = H(Z, M)$, $E = h \oplus r$ 

4. Generate $s = x(k - E) \bmod q$ 

   - Player $P_i$ computes $s_i = x_i (k_i - E) + c_i \bmod q$ using Mult-ZK 
   - Interpolate the value $s = x(k - E) \bmod q$ 

5. Output the pair (r, s) as the signature for M 



Fig. 9. A Robust Distributed Signature Phase of KCDSA 
5.2 Distributed Signature Phase of KCDSA 

In this phase the players jointly produce the signature for a message M. The detailed 
protocol is described in Figure 9. 

Lemma 1. KCDSA-Thresh is a (t, n)-robust threshold KCDSA signature protocol, that 
is, it tolerates up to t eavesdropping and halting faults if the number of players is 
$n \ge 2t + 1$. 

Proof. The lemma follows easily by reviewing the proposed protocol. 



6 Conclusion 

A threshold signature scheme allows a group of players, rather than a single player, to 
produce a signature. In this signature scheme, the secret key is shared by a group of 
players, and the public key is published by the group. Each player 
generates his partial signature in order to produce a complete signature on a given message m. This 
signature scheme can be applied to a certification authority in a public-key infrastructure, 
where each signature system in the certification authority holds a share of the secret key 
corresponding to the authority's public key. 

In this paper we presented some basic protocols, i.e., a verifiable secret sharing scheme 
using a hash function and a multiplication protocol for two secrets based on a 
non-interactive ZK proof. We also applied these protocols to a threshold KCDSA signature 
scheme. KCDSA-Thresh is a (t, n)-robust threshold KCDSA signature protocol, that 
is, it tolerates up to t eavesdropping and halting faults if the number of players is 
$n \ge 2t + 1$. The proposed VSS and robust multiplication protocol can be applied to 
any threshold cryptosystem. 
Abstract. In this paper, we present algorithms, suitable for hardware 
implementation, for computation in the Jacobian of a hyperelliptic curve 
defined over GF($2^n$). We take curves of genus 3 and 6, design the hardware by using 
0.27-µm CMOS gate array technology, and estimate the number of multiplication 
operations and the size and speed of hardware based on the proposed algorithm. 
It is shown that hardware for genus 6 curves computes an addition (resp. 
doubling) operation in 100 (resp. 29) clock cycles and can work at clock 
frequencies of up to 83 MHz. We also compare a hyperelliptic curve 
cryptosystem with RSA and elliptic curve cryptosystems from the viewpoint of 
hardware implementation. 



1. Introduction 

Koblitz [Ko88, Ko89] investigated the Jacobians of hyperelliptic curves defined over 
finite fields, and proposed hyperelliptic curve cryptosystems. Frey [FR94] showed 
that the discrete logarithm problem of Koblitz's hyperelliptic cryptosystems can be 
solved in sub-exponential time, and Sakai, Sakurai, and Ishizuka [SSI98] and Smart 
[Sm99] studied the Jacobians of hyperelliptic curves and found Jacobians that are 
secure against all known attacks. 

Explicit formulas for addition in the Jacobians of hyperelliptic curves were 
introduced by Cantor [Ca87] and Koblitz [Ko88, Ko89]. Although the formulas for 
addition in Jacobians are more complicated than those for the addition of points on an 
elliptic curve, if the order of a Jacobian of a hyperelliptic curve has the same size as 
the order of points on an elliptic curve, the ground field of the Jacobian is smaller than 
that of the elliptic curve. This is an advantageous feature for hardware 
implementation. In addition, the multiplication operation for polynomials used in the 
formulas can be effectively performed by parallel-processing hardware. 

JooSeok Song (Ed.): ICISC'99, LNCS 1787, pp. 221-235, 2000. 

© Springer-Verlag Berlin Heidelberg 2000 
The objective of this paper is to investigate how effectively hyperelliptic curve 
cryptosystems can be implemented in hardware. 

First, we explain the algorithm [TM99] based on the algorithm introduced by 
[Ca87] and [Ko88, Ko89], and discuss the number of multiplication operations from 
the viewpoint of hardware. For explanatory purposes, the Jacobians associated with the 
curves $C: y^2 + y = x^7$ /GF(2) of genus 3 and $C: y^2 + y = x^{13} + x^{11} + x^{?} + 1$ /GF(2) of 
genus 6 were chosen from the Jacobians proposed in [SSI98] and [Sm99]. 

Recently, Gaudry gave a new algorithm (Gaudry's variant) for the discrete logarithm 
problem on hyperelliptic curves [Ga99]. Duursma, Gaudry, and Morain [DGM99] 
also presented a method for speeding up discrete log computations on curves having 
automorphisms of large order. This method uses a parallel collision search and 
obtains a speed-up of $\sqrt{m}$ if there exists an automorphism of order m. Its author's 
heuristic analysis says that the attack would be effective for curves of genus > 4. 
Therefore, hyperelliptic cryptosystems defined over Jacobians of curves of genus 
3 have the same level of security as 160-bit-key elliptic curve cryptosystems (ECC), but 
hyperelliptic cryptosystems defined over Jacobians of curves of genus 6 are 
weaker than 160-bit-key ECC. However, our results are independent of the specific 
structure of such curves with large automorphisms and of the security levels of the 
cryptosystems, and every technique in our implementation is valid for curves 
without automorphisms of large order. 

Next, we describe the results of the logic design and synthesis, which uses 0.27-µm 
CMOS gate array technology to estimate the size and speed of the hardware. 

Finally, we analyze the dependency of hardware efficiency on the genus of the 
curve. The efficiency is defined as the speed divided by (size)*(power 
consumption). We compare our results with RSA and with elliptic curve cryptosystems. 



2. Preliminaries 



Let K be a field, and let $\bar{K}$ denote its algebraic closure. We define a hyperelliptic 
curve C of genus g over K to be an equation of the form $y^2 + h(x)y = f(x)$, where h(x) 
is a polynomial of degree at most g and f(x) is a monic polynomial of degree 2g+1. 
We are concerned in this paper with finite fields of characteristic 2. The points P(x, y) 
generate the free group of divisors. A divisor D is a finite formal sum of $\bar{K}$-points, 
$D = \sum m_i P_i$, $m_i \in Z$. We define the degree of D as $\deg(D) = \sum m_i$. The divisors form an 
additive group, in which the divisors of degree 0 form a subgroup $D^0$. A rational 
function r has a finite number of zeros and poles on C. We associate r with its divisor 
$(r) = \sum m_i P_i$, where the $P_i$ are poles or zeros with multiplicities $m_i$. A divisor of a 
nonzero function, such as (r), is called principal. The principal divisors form a 
subgroup P of $D^0$. The Jacobian variety is defined as the quotient group $J_C(K) = D^0/P$. 

Let $F_q$ be a finite field with q elements. The discrete logarithm problem of $J_C(F_{q^n})$ is 
the problem, given two divisors $D_1$ and $D_2$ defined over $F_{q^n}$, of determining an 
integer m such that $D_2 = mD_1$, if such an m exists. 
3. Proposed Algorithm 



3.1 Computing in Jacobians 

An element of a Jacobian variety can be represented uniquely by a reduced divisor. 
Any reduced divisor is regarded as a pair of polynomials (a, b) satisfying deg b < deg a and 
deg a ≤ g. We give a brief description of the algorithm introduced by Cantor [Ca87] 
and Koblitz [Ko88, Ko89] for the addition $D_3 = D_1 + D_2$, where $D_3 = \mathrm{div}(a_3, b_3)$, $D_2 = \mathrm{div}(a_2, b_2)$, 
and $D_1 = \mathrm{div}(a_1, b_1)$. 

First, we compute the greatest common divisor (GCD) of the polynomials $a_1$ and $a_2$. 
Note that the case in which $\gcd(a_1, a_2) = 1$ is extremely likely if the ground field K is 
large and $a_1$ and $a_2$ are the coordinates of two randomly chosen elements of the 
Jacobian. The case in which $a_1$ and $a_2$ are not coprime does not strongly affect the 
performance. Therefore, this paper investigates only the case in which $\gcd(a_1, a_2) = 1$ 
and the doubling case $D_1 = D_2$. We assume that other cases are processed by software 
with the assistance of hardware. 

We use the extended Euclidean algorithm and compute $d = \gcd(a_1, a_2)$ and two 
polynomials $s_1$ and $s_2$ satisfying the equation $s_1 a_1 + s_2 a_2 = d$. For convenience, $s_1$ and 
$s_2$ are divided by d to meet the condition that $s_1 a_1 + s_2 a_2 = 1$. 

The extended Euclidean algorithm is also used when we compute the error location 
and evaluation polynomials from the syndrome in the decoding of Reed-Solomon codes. 
This is a powerful error correction code that is widely used by storage devices and 
communications, and is frequently implemented in hardware. In the decoding of 
Reed-Solomon codes, the GCD, d, is an error evaluation polynomial, and $s_1$ or $s_2$ is an 
error location polynomial. The difference between 
decoding and addition is that the decoding of Reed-Solomon codes requires only one of $s_1$ 
and $s_2$, whereas addition in a Jacobian requires both $s_1$ and $s_2$. 

There are several different implementations of decoders using the Euclidean 
algorithm [IDI95], [JS99]. Here, we take the simplest one, to maximize the 
parallelism. The hardware for addition in Jacobians of genus-g hyperelliptic curves 
consists of four register sets: Ureg, Xreg, Yreg, and Zreg. Ureg and Xreg have (g+1) 
registers for storing the coefficients of a polynomial of degree g, while Yreg and Zreg 
have g registers for storing the coefficients of a polynomial of degree (g-1). A Galois 
field multiplier is placed in each register of Ureg and Yreg, and one is also placed in 
the inversion operator in the circuit. The circuit contains a total of (4g+2) coefficient 
registers, (2g+1) Galois field multipliers, and one inversion operator (see Fig. 1). 

No explanation of the hardware computation of the GCD is included 
here, because the process is not related to the objective of this paper. We assume that 
Ureg and Xreg store the GCD, d, and the $s_1$ that satisfies $s_1 a_1 + s_2 a_2 = 1$ after the 
computation. 
Fig. 1. Hardware Configuration 



When $a_1$ and $a_2$ are coprime, the algorithm for the addition is as follows 
[Ca87], [SSI98]: 

Algorithm 1 (Addition) 

Input   a1, a2, b1, b2, s1, s2 
Output  a', b' 

a3 = a1 * a2                                     Step A1 
b3 = (s1*a1*b2 + s2*a2*b1) mod a3 
a4 = (f + b3 + b3^2) / a3                        Step A2 
a4 = a4 / (leading coefficient of a4) 
b4 = (b3 + 1) mod a4 
a' = a4; b' = b4; 
while (deg a4 > g) {                             Step A3 
    a5 = (f + b4 + b4^2) / a4 
    a5 = a5 / (leading coefficient of a5) 
    b5 = (b4 + 1) mod a5 
    a' = a5; b' = b5; 
    a4 = a5; b4 = b5; 
} 
Return [a', b'] 

End 

- Step A1  Since the input polynomials $a_1$ and $a_2$ have degree g, and $s_1$ and $s_2$ 
           have degree (g-1), $a_3$ and $b_3$ have degrees 2g and (2g-1), 
           respectively. Furthermore, $(s_1 a_1 b_2 + s_2 a_2 b_1)$ has degree (3g-2), 
           and therefore Step A1 takes a total of $(13g^2 - 12g + 2)$ field 
           multiplications [SSI98]. 

- Step A2  $a_3$ and $b_3$ have degrees 2g and (2g-1), respectively. Since the 
           degree of f is (2g+1), $a_4$ and $b_4$ have degrees (2g-2) and (2g-3). 
           Step A2 takes a total of $(16g^2 - 14g + 3)$ field multiplications. 

- Step A3  In the first iteration, $a_4$ and $b_4$ have degrees (2g-2) and (2g-3), 
           respectively. In the case of a genus 3 curve, the number of iterations 
           is 1, and in the case of a genus 6 curve, the number of iterations is 2. 

In Steps A1 and A2, computation of the multiplication and division operations for 
polynomials of degree 2g takes a considerable time. To reduce the amount of 
computation, we introduce the polynomial $q(x) = s_1 (b_1 + b_2) \bmod a_2$. 



Lemma 1. In Step A2, using q(x), we can express $a_4$ as follows: 

$a_4 = Q(q^2 a_1, a_2) + Q(f, a_3)$ 

Here, Q(u, v) is a function that gives the quotient of dividing u by v. 

Proof. First, we show that $b_3 = q a_1 + b_1$. Note that $s_1 a_1 + s_2 a_2 = 1$ and that $\deg a_1 a_2 > \deg b_1$. 
We can compute $b_3$ as follows: 

$b_3 = (s_1 a_1 (b_1 + b_2)) \bmod (a_1 a_2) + b_1 = \{s_1 (b_1 + b_2) \bmod a_2\}\, a_1 + b_1 = q a_1 + b_1$   (1) 

Next, since the division for computing $a_4$ has no remainder, and $Q(b_3, a_3) = 0$ from 
$\deg b_3 < \deg a_3$, 

$a_4 = Q(f + b_3 + b_3^2, a_3) = Q(f, a_3) + Q(b_3^2, a_3)$   (2) 

Substituting (1) into the second term of (2) and noting that $Q(b_1^2, a_3) = 0$ from $\deg b_1^2 < \deg a_3$, 
we obtain 

$Q(b_3^2, a_3) = Q(q^2 a_1^2 + b_1^2, a_1 a_2) = Q(q^2 a_1^2, a_1 a_2) = Q(q^2 a_1, a_2)$   (3) 

Combining equations (2) and (3), we get $a_4 = Q(q^2 a_1, a_2) + Q(f, a_3)$. 
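
The identity of Lemma 1 can be checked numerically with a small Python sketch. For brevity it works over GF(2)[x] rather than over the larger ground fields used in the paper, and it encodes a polynomial as an integer bit mask; both choices, and all function names, are assumptions for illustration. The quotient identity holds for any coprime monic $a_1, a_2$ of degree g and any $b_1, b_2$ of degree below g, so the inputs need not be valid divisors.

```python
def deg(a):
    """Degree of a GF(2)[x] polynomial stored as a bit mask (deg 0 -> -1)."""
    return a.bit_length() - 1


def pmul(a, b):
    """Carry-less product of two GF(2)[x] polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def pdivmod(a, b):
    """Quotient and remainder of a divided by b in GF(2)[x]."""
    if b == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    q = 0
    while deg(a) >= deg(b):
        shift = deg(a) - deg(b)
        q ^= 1 << shift
        a ^= b << shift
    return q, a


def pinv(a, m):
    """Inverse of a modulo m via the extended Euclidean algorithm."""
    r0, r1, t0, t1 = m, pdivmod(a, m)[1], 0, 1
    while r1:
        qt, r = pdivmod(r0, r1)
        r0, r1, t0, t1 = r1, r, t1, t0 ^ pmul(qt, t1)
    if r0 != 1:
        raise ValueError("polynomials are not coprime")
    return pdivmod(t0, m)[1]


def lemma1_holds(a1, a2, b1, b2, f):
    a3 = pmul(a1, a2)
    s1 = pinv(a1, a2)
    s2 = pdivmod(1 ^ pmul(s1, a1), a2)[0]        # s1*a1 + s2*a2 = 1
    b3 = pdivmod(pmul(pmul(s1, a1), b2) ^ pmul(pmul(s2, a2), b1), a3)[1]
    q = pdivmod(pmul(s1, b1 ^ b2), a2)[1]
    direct = pdivmod(f ^ b3 ^ pmul(b3, b3), a3)[0]
    shortcut = pdivmod(pmul(pmul(q, q), a1), a2)[0] ^ pdivmod(f, a3)[0]
    return b3 == (pmul(q, a1) ^ b1) and direct == shortcut
```

With $a_1 = x^3 + x + 1$ and $a_2 = x^3 + x^2 + 1$ (so $c_2 = 0$, $e_2 = 1$) and $f = x^7$, the quotient $Q(f, a_3)$ evaluates to $x + 1 = x + c_2 + e_2$.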

We take the curve $C: y^2 + y = x^7$ /GF(2) of genus 3 to show an application of Lemma 1. 
By applying $f(x) = x^7$, we obtain $Q(f, a_3) = x + c_2 + e_2$. The new Algorithm 2 is as 
follows: 



Algorithm 2 (Addition) 

Input   a1, a2, b1, b2, s1 
Output  a', b' 

q  = s1 * (b1 + b2) mod a2                       Step A1' 
a4 = Q(q^2 * a1, a2) + x + c2 + e2               Step A2' 
a4 = a4 / (leading coefficient of a4) 
b4 = (q*a1 + b1 + 1) mod a4 
a' = a4; b' = b4; 
while (deg a4 > g) {                             Step A3' 
    a5 = Q(x^7 + b4^2, a4) 
    a5 = a5 / (leading coefficient of a5) 
    b5 = (b4 + 1) mod a5 
    a' = a5; b' = b5; 
    a4 = a5; b4 = b5; 
} 
Return [a', b'] 

End 

Here, $a_1(x) = x^3 + c_2 x^2 + c_1 x + c_0$ and $a_2(x) = x^3 + e_2 x^2 + e_1 x + e_0$. In the computation 
of the polynomial $a_4$, the polynomials $a_3$ and $b_3$ do not appear in Algorithm 2. The 
RTL behavior of the hardware for Algorithm 2 is described in the appendix. 

When $D_1 = D_2$, i.e., for doubling, we can compute as follows. 

Algorithm 3 (Doubling) 

Input   a1, b1 
Output  a', b' 

a3 = a1^2                                        Step D1 
b3 = (b1^2 + f) mod a3 
a4 = (f + b3 + b3^2) / a3                        Step D2 
a4 = a4 / (leading coefficient of a4) 
b4 = (b3 + 1) mod a4 
a' = a4; b' = b4; 
while (deg a4 > g) {                             Step D3 
    a5 = (f + b4 + b4^2) / a4 
    a5 = a5 / (leading coefficient of a5) 
    b5 = (b4 + 1) mod a5 
    a' = a5; b' = b5; 
    a4 = a5; b4 = b5; 
} 
Return [a', b'] 

End 

- Step D1  The polynomials $a_1$ and $b_1$ have degree g and (g-1), respectively. 
           Therefore, the computation of $a_3$ and $b_3$ takes $g^2$ and $(g-1)^2$ field 
           multiplications, respectively. Step D1 takes a total of $(6g^2 + 1)$ field 
           multiplications. 

- Step D2  Step D2 is the same as Step A2 and takes $(16g^2 - 14g + 3)$ field 
           multiplications. 

- Step D3  Step D3 is the same as Step A3. 

In Steps D1 and D2, the computation of multiplication and division operations for 
polynomials of degree 2g takes a considerable time. To reduce the amount of 
computation, we introduce the polynomial $q(x) = Q(b_3, a_1)$. 



Lemma 2. In Step D2, we can express $a_4$ as follows: 

$a_4 = q^2 + Q(f, a_3)$ 

Here, Q(u, v) is a function that gives the quotient of dividing u by v. 

Proof. $Q(b_3, a_3)$ is equal to 0, because $\deg b_3 < \deg a_3$; therefore, 

$a_4 = Q(b_3^2, a_3) + Q(f, a_3)$   (4) 

Let the result of dividing $b_3$ by $a_1$ be equal to $q + (r/a_1)$, where q is a quotient 
polynomial and r is a remainder polynomial. Since the characteristic is 2, we get the 
following equation: 

$b_3^2 / a_1^2 = q^2 + (r^2 / a_1^2)$   (5) 

Combining equations (4) and (5), we get $a_4 = q^2 + Q(f, a_3)$. 
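
Lemma 2 can be checked in the same spirit with a short Python sketch. As before, it uses GF(2)[x] with integer bit masks instead of the paper's ground fields, and the function names are assumptions for illustration; it also confirms the reduction $f \bmod a_3 = x(a_1 - x^3)^2$ for $f = x^7$ that is used in the genus 3 example.

```python
def deg(a):
    """Degree of a GF(2)[x] polynomial stored as a bit mask (deg 0 -> -1)."""
    return a.bit_length() - 1


def pmul(a, b):
    """Carry-less product of two GF(2)[x] polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def pdivmod(a, b):
    """Quotient and remainder of a divided by b in GF(2)[x]."""
    if b == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    q = 0
    while deg(a) >= deg(b):
        shift = deg(a) - deg(b)
        q ^= 1 << shift
        a ^= b << shift
    return q, a


def lemma2_holds(a1, b1, f):
    a3 = pmul(a1, a1)                             # Step D1: a3 = a1^2
    b3 = pdivmod(pmul(b1, b1) ^ f, a3)[1]         # b3 = (b1^2 + f) mod a3
    q = pdivmod(b3, a1)[0]                        # q = Q(b3, a1)
    direct = pdivmod(f ^ b3 ^ pmul(b3, b3), a3)[0]
    shortcut = pmul(q, q) ^ pdivmod(f, a3)[0]     # q^2 + Q(f, a3)
    return direct == shortcut
```

Squaring is linear in characteristic 2, which is why the cross term of $(q a_1 + r)^2$ vanishes and only $q^2$ survives in the quotient.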

We also take the curve $C: y^2 + y = x^7$ /GF(2) of genus 3 to show the application of 
Lemma 2 to Algorithm 3. 



Algorithm 4 (Doubling) 

Input   a1, b1 
Output  a', b' 

a3 = a1^2                                        Step D1' 
b3 = b1^2 + x*(a1 - x^3)^2 
q  = Q(b3, a1)                                   Step D2' 
a4 = q^2 + Q(f, a3) 
a4 = a4 / (leading coefficient of a4) 
b4 = (b3 + 1) mod a4 
a' = a4; b' = b4; 
while (deg a4 > g) {                             Step D3' 
    a5 = Q(x^7 + b4^2, a4) 
    a5 = a5 / (leading coefficient of a5) 
    b5 = (b4 + 1) mod a5 
    a' = a5; b' = b5; 
    a4 = a5; b4 = b5; 
} 
Return [a', b'] 

End 

Here, $f \bmod a_3 = x (a_1 - x^3)^2$ and $b_1^2 \bmod a_3 = b_1^2$ are used in computing $b_3$. From the 
viewpoint of hardware implementation, registers for storing $a_3$ and $b_3$ are not really 
needed, since the hardware size of the squaring operator is less than that of the 
register. If the hardware has squaring operations, we can neglect Step D1', that is, the 
computation of the polynomials $a_3$ and $b_3$. In addition, since the characteristic of the 
ground field is 2, the computation of $q^2$ requires only squaring operations, and the 
computation of $a_4$ is simplified. 



3.2 Computational Complexity of the New Algorithm 

To investigate the case of hardware implementation, we use two notations, m and M, 
for multiplication. As explained in the previous section, we assume that the hardware 
has (2g+1) Galois field multipliers and can execute (2g+1) multiplications 
simultaneously. We denote one parallel execution on the hardware by M, a single multiplication 
by m, and an inversion by I. We estimate the computational complexity by using m, 
M, and I, and neglect squarings. The estimation was made by using two Jacobian 
varieties: $J_C(GF(2^{59}))$, given by the genus 3 curve $C: y^2 + y = x^7$ /GF(2), and 
$J_C(GF(2^{29}))$, given by the genus 6 curve $C: y^2 + y = x^{13} + x^{11} + x^{?} + 1$ /GF(2). The 
results are given in Tables 1-4. 



Table 1. Number of field operations (g = 3 addition) 

Computation          Computation time    Computation time (H/W)    Processing time (H/W) 
GCD                  3I + 23m            3I + 9M                   3t(I) + 9t(M) 
Step A1'             15m                 4M                        4t(M) 
Step A2'    a4       I + 20m             I + 6M                    t(I) + t(M) 
            b4       17m                 5M                        5t(M) 
Step A3'    a5       3m                  2M                        2t(M) 
            b5       3m                  M                         t(M) 
Total                4I + 81m            4I + 27M                  4t(I) + 22t(M) 



Table 2. Number of field operations (g = 3 doubling) 

Computation          Computation time    Computation time (H/W)    Processing time (H/W) 
Step D2'    q        3m                  2M                        0 
            a4       I + 2m              I + M                     t(I) + t(M) 
            b4       8m                  2M                        2t(M) 
Step D3'    a5       3m                  2M                        2t(M) 
            b5       3m                  M                         t(M) 
Total                I + 19m             I + 8M                    t(I) + 6t(M) 



Table 3. Number of field operations (g = 6 addition) 

Computation          Computation time    Computation time (H/W)    Processing time (H/W) 
GCD                  6I + 86m            6I + 21M                  6t(I) + 21t(M) 
Step A1'             66m                 6M                        6t(M) 
Step A2'    a4       I + 85m             I + 11M                   11t(M) 
            b4       56m                 6M                        6t(M) 
Step A3'    a5       I + 44m             I + 9M                    9t(M) 
            b5       16m                 2M                        2t(M) 
Step A3'    a6       I + 27m             I + 7M                    t(I) + t(M) 
            b6       12m                 2M                        2t(M) 
Total                9I + 392m           9I + 64M                  7t(I) + 58t(M) 
Table 4. Number of field operations (g = 6 doubling) 

Computation          Computation time    Computation time (H/W)    Processing time (H/W) 
Step D2'    q        15m                 5M                        0 
            a4       I + 5m              I + M                     t(I) + t(M) 
            b4       20m                 2M                        2t(M) 
Step D3'    a5       I + 44m             I + 9M                    9t(M) 
            b5       16m                 2M                        2t(M) 
Step D3'    a6       I + 27m             I + 7M                    t(I) + t(M) 
            b6       12m                 2M                        2t(M) 
Total                3I + 139m           3I + 28M                  2t(I) + 17t(M) 



Here, t(M) and t(I) represent the processing times for multiplication and inversion. 
In Table 1, it is assumed that t(I) > 5t(M) and that the computation of $a_4$ and its 
inversion can be executed simultaneously. In Table 2, it is assumed that t(I) > 2t(M) 
and that the computation of q and its inversion can be executed simultaneously. 
Similarly, it is assumed that t(I) < 11t(M) in Table 3 and that t(I) > 5t(M) in Table 4. 

An efficient implementation of arithmetic in GF($2^n$) is discussed in [ITT86]. An 
inversion in GF($2^{59}$) takes 8 multiplications and an inversion in GF($2^{29}$) takes 6 
multiplications. All the assumptions mentioned above are true when the method 
described in [ITT86] is used. Applying t(I) = 8t(M) or t(I) = 6t(M) to Tables 1-4, we 
can get the following results: 



Table 5. Summary of the number of field operations and time 





                   Computation time      Processing time

g = 3 addition     113m (4I + 81m)       54t(M)
g = 3 doubling     27m (I + 19m)         14t(M)
g = 6 addition     446m (9I + 392m)      100t(M)
g = 6 doubling     157m (3I + 139m)      29t(M)



The data in Table 5 show that the processing time is proportional to the genus, even
though the computation time is proportional to the square of the genus or a higher
order.
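
The step from Tables 1-4 to Table 5 can be checked mechanically. The sketch below is illustrative only: it assumes that the inversion of [ITT86] follows the usual Itoh-Tsujii style addition chain, whose multiplication count for GF(2^n) is floor(log2(n-1)) + HW(n-1) - 1, and it takes the field degrees 59 and 29 from the register widths listed in Table 7. The function and dictionary names are hypothetical.

```python
def inversion_multiplications(n: int) -> int:
    """Multiplications for one inversion in GF(2^n), assuming an
    Itoh-Tsujii style addition chain on the exponent n - 1."""
    e = n - 1
    return (e.bit_length() - 1) + bin(e).count("1") - 1


# Field degrees inferred from the register widths in Table 7 (assumption).
FIELD_DEGREE = {3: 59, 6: 29}

# (inversions, multiplications): computation-time totals quoted in Table 5.
COMPUTATION = {
    (3, "addition"): (4, 81),
    (3, "doubling"): (1, 19),
    (6, "addition"): (9, 392),
    (6, "doubling"): (3, 139),
}

# (t(I) terms, t(M) terms): processing-time totals of Tables 2-4.
PROCESSING = {
    (3, "doubling"): (1, 6),
    (6, "addition"): (7, 58),
    (6, "doubling"): (2, 17),
}


def in_multiplications(genus: int, inversions: int, multiplications: int) -> int:
    """Express a total as multiplications only, using t(I) = k * t(M)."""
    k = inversion_multiplications(FIELD_DEGREE[genus])
    return inversions * k + multiplications


if __name__ == "__main__":
    for (genus, op), (inv, mul) in COMPUTATION.items():
        print(f"g = {genus} {op}: {in_multiplications(genus, inv, mul)}m")
    for (genus, op), (inv, mul) in PROCESSING.items():
        print(f"g = {genus} {op}: {in_multiplications(genus, inv, mul)}t(M)")
```

Under these assumptions the inversion costs come out as 8 and 6 multiplications, and the converted totals agree with the entries of Table 5 for the rows covered here.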

In [SSI98], the number of field operations is also estimated for the case of software
implementation. An addition takes 401 multiplications and a doubling takes 265
multiplications in the Jacobian J_C(GF(2^59)) given by the genus 3 curve
C: y^2 + y = x^7 / GF(2). In comparison, we found that the proposed algorithm is 3.5
(resp. 10) times better with respect to the total computation time and 7 (resp. 19)
times faster with respect to the total processing time in the case of genus 3 addition
(resp. doubling).
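
The quoted factors follow directly from dividing the [SSI98] counts by the genus 3 rows of Table 5. A minimal sketch of that arithmetic, with hypothetical variable names:

```python
# Multiplication counts quoted from [SSI98] for the software implementation.
SOFTWARE_SSI98 = {"addition": 401, "doubling": 265}

# Genus 3 rows of Table 5 (computation in m, processing in t(M)).
PROPOSED_COMPUTATION = {"addition": 113, "doubling": 27}
PROPOSED_PROCESSING = {"addition": 54, "doubling": 14}


def speedups(reference: dict, proposed: dict) -> dict:
    """Ratio reference / proposed for every operation."""
    return {op: reference[op] / proposed[op] for op in reference}


computation_gain = speedups(SOFTWARE_SSI98, PROPOSED_COMPUTATION)
processing_gain = speedups(SOFTWARE_SSI98, PROPOSED_PROCESSING)

if __name__ == "__main__":
    for op in SOFTWARE_SSI98:
        print(f"{op}: computation x{computation_gain[op]:.1f}, "
              f"processing x{processing_gain[op]:.1f}")
```

The ratios are about 3.5 and 9.8 for the computation time and about 7.4 and 18.9 for the processing time, which the text rounds to 3.5 (resp. 10) and 7 (resp. 19).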

Assuming that k has 160 bits and that the number of doublings is equal to 160 and
the number of additions is equal to 80, Table 5 gives the following results for the time
taken to compute kD in Jacobians:


Table 6. Average processing time to compute kD

         Operating      Number of clocks for multiplication
         frequency      Case A: t(M) = 8 clocks    Case B: t(M) = 1 clock

g = 3    20 MHz         2.624 ms                   0.328 ms
g = 3    40 MHz         1.312 ms                   0.164 ms
g = 3    80 MHz         0.656 ms                   0.082 ms
g = 6    20 MHz         5.056 ms                   0.632 ms
g = 6    40 MHz         2.528 ms                   0.316 ms
g = 6    80 MHz         1.264 ms                   0.158 ms



Here, to estimate the time in the case of hardware implementation, two parameters
are introduced: (1) the operating frequency of the hardware, and (2) the number of
clocks for a multiplication operation.
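
The entries of Table 6 can be reproduced from the processing times in Table 5 and the two parameters just introduced. The following sketch assumes exactly the operation counts stated in the text (160 doublings, 80 additions); the function name is hypothetical.

```python
DOUBLINGS, ADDITIONS = 160, 80

# Processing times from Table 5, in units of t(M).
PROCESSING_TM = {
    3: {"doubling": 14, "addition": 54},
    6: {"doubling": 29, "addition": 100},
}


def kd_time_ms(genus: int, clocks_per_multiplication: int, frequency_mhz: float) -> float:
    """Average time to compute kD, in milliseconds."""
    units = (DOUBLINGS * PROCESSING_TM[genus]["doubling"]
             + ADDITIONS * PROCESSING_TM[genus]["addition"])
    clocks = units * clocks_per_multiplication
    # frequency_mhz * 1000 is the number of clock cycles per millisecond.
    return clocks / (frequency_mhz * 1000)


if __name__ == "__main__":
    for genus in (3, 6):
        for mhz in (20, 40, 80):
            case_a = kd_time_ms(genus, 8, mhz)
            case_b = kd_time_ms(genus, 1, mhz)
            print(f"g = {genus}, {mhz} MHz: {case_a:.3f} ms / {case_b:.3f} ms")
```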



4. Hardware Implementation 



4.1 Hardware Efficiency

Here, we estimate the maximum operating frequency and the size of the hardware.
Two cases from Table 6 are taken for comparison: (1) g = 3 with Case A
multiplication and (2) g = 6 with Case B multiplication. As is well known, the
advantages of arithmetic in GF(2^n) are as follows:

1. Multiplication and addition can be executed by a small amount of hardware.

2. The 2^n-th power operation (n = 1, 2, ...) is very simple. When normal bases are
used, only a bit shift operation is required; even if polynomial bases are used, the
hardware size is still small.

Thus, no custom design methodology is needed for the Galois field arithmetic
unit; it is sufficient to design the unit by using gate array methodology. We designed
the hardware in VHDL and performed simulation using IBM's BooleDozer [BLD] and
Model Technology's ModelSim [MTS]. Circuits were synthesized by using IBM
CMOS 5SE gate array technology with an effective channel length Leff = 0.27-um
[CDB]. There are no optimal normal bases [MOV89] in GF(2^59) and GF(2^29), and we
take p(x) = x^59 + ... + x + 1 or p(x) = x^29 + ... + 1, respectively, as the primitive
polynomials of the Galois field [LN87].

The results of design and synthesis are given in the table below. Here, we force the
constraint that the delay on the critical path between registers is at most 12 ns; that is,
the maximum operating frequency is 83 MHz.


Table 7. Size of hardware

Block                      Size (cells)
                           g = 3 Case A             g = 6 Case B

Multiplier                  34265  [7]               66196  [13]
Squaring Operator            1344  [3]                 495  [11]
Inversion Operator          27414  [1]                8580  [1]
Register                    18408  [59bit x26]       17400  [29bit x50]
Control                      9749                     7395
Selector for register       37140                    53939
Selector for operators      17402                    16851
Total                      145722                   170856
(After optimization)       140647                   165743


The numbers in square brackets represent Galois field arithmetic units. Note that 
the sizes of 1-bit latch, 2-way XOR and 2-way NAND are 12 cells, 3 cells, and 2 
cells, respectively. 

The register block has 4g registers for storing the input polynomials, and consists 
of (8g + 2) coefficient registers. Note that the total numbers of register bits are nearly 
equal in both cases. 

The cell size of one multiplier is 4895 in g = 3/Case A and 5092 in g = 6/Case B.
The numbers are almost the same, but the implementations are different. In Case A,
the multiplier has registers and processes its input one byte at a time by using a
division circuit (digit serial). On the other hand, the multiplier in Case B has no
registers (bit parallel). The cell size of the multiplier in GF(2^59) is about 20K cells
for t(M) = 1 in the case of genus 3. Since there are a total of 8 multipliers in the case
of genus 3 curves, the total cell size will be about 250K cells if the multiplier of
Case B (t(M) = 1) is used. The data given in Tables 6 and 7 show that the total size
for g = 6/Case B is about 25K cells larger than that for g = 3/Case A, but that the
computation time is four times shorter.
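
The figures quoted from Table 7 are easy to cross-check. The sketch below recomputes the block totals, the size of a single multiplier, and the register sizes from the stated cost of 12 cells per 1-bit latch; the dictionary layout is only an illustrative choice.

```python
CELLS_PER_LATCH_BIT = 12  # size of a 1-bit latch, as noted under Table 7

# Block sizes in cells, copied from Table 7.
BLOCKS = {
    "g=3/Case A": {
        "Multiplier": 34265, "Squaring Operator": 1344,
        "Inversion Operator": 27414, "Register": 18408, "Control": 9749,
        "Selector for register": 37140, "Selector for operators": 17402,
    },
    "g=6/Case B": {
        "Multiplier": 66196, "Squaring Operator": 495,
        "Inversion Operator": 8580, "Register": 17400, "Control": 7395,
        "Selector for register": 53939, "Selector for operators": 16851,
    },
}
MULTIPLIER_UNITS = {"g=3/Case A": 7, "g=6/Case B": 13}
REGISTER_SHAPE = {"g=3/Case A": (59, 26), "g=6/Case B": (29, 50)}  # (bits, count)


def total_cells(case: str) -> int:
    """Sum of all block sizes for one configuration."""
    return sum(BLOCKS[case].values())


def cells_per_multiplier(case: str) -> int:
    """Size of a single Galois field multiplier."""
    return BLOCKS[case]["Multiplier"] // MULTIPLIER_UNITS[case]


def register_cells(case: str) -> int:
    """Register block size derived from the number of latch bits."""
    bits, count = REGISTER_SHAPE[case]
    return bits * count * CELLS_PER_LATCH_BIT


if __name__ == "__main__":
    for case in BLOCKS:
        print(case, total_cells(case), cells_per_multiplier(case), register_cells(case))
```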

[En99] gives a formula for estimating the number of multiplication and inversion
operations for addition and doubling. In the case where g < 10, the formula uses
Gauss reduction, and the degree of the leading term is 3. In the case where g > 12, it
uses Legendre reduction, and the degree of the leading term is 2. But since the
coefficient of degree 3 is small, the term of degree 2 is dominant. This can be easily
seen in Tables 1-4. The number of multiplication operations can therefore be
approximated by O(g^2).

We analyze the dependency of the hardware efficiency on the genus of the curves.
Here, for simplicity, we consider only the Galois field multiplier and ignore other
parts of the circuits. We define the efficiency as the result of dividing the speed by
(hardware size) * (power consumption). Let n be the extension degree of the ground
field over which a curve is defined; n is proportional to 1/g. Since the power
consumption of one multiplier is proportional to n^2 and the number of multiplications
is proportional to g^2, the total power consumption does not depend on g. Since the
delay of the multiplier of Case B is proportional to log_2 n, the computation time is
proportional to g^2 log_2 n without any parallel operation.
As a result, the efficiency is proportional to 1/log_2(C/g), where C is a constant and
C/g is n. This function increases slowly for small g (> 0), and therefore the efficiency
does not vary considerably with g. The data given in Table 7 agree well with this
result. The size of the multiplier in the case of g = 3/Case B is estimated to be about
160K cells and the size of the multiplier in g = 6/Case B is 70K cells. The
computation time in g = 6/Case B is twice that in g = 3/Case B. Since it can be
considered that the total power consumption in both cases is the same, we get
almost the same values for the efficiency. But it is impossible to design a multiplier
for which the product of the size and speed is constant at any speed. For example, the
size of the multiplier in g = 3/Case A is 1/4, but the speed is 1/8, of that in
g = 3/Case B. The efficiency in g = 6/Case B is 2.3 times higher than that in
g = 3/Case A if we consider only the multiplier.
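
The proportionality stated above can be traced step by step. The following derivation is a sketch: besides the statements in the text, it assumes that the size S of one bit-parallel multiplier grows like n^2 (the text states this growth only for the power consumption), and the symbols T, S and P_total are introduced here for illustration.

```latex
\begin{align*}
n &= C/g, \\
P_{\text{total}} &\propto n^{2} g^{2} = C^{2}
  && \text{(independent of } g\text{)}, \\
T &\propto g^{2} \log_{2} n
  && \text{(no parallel operation)}, \\
S &\propto n^{2} = C^{2}/g^{2}
  && \text{(assumed size of one multiplier)}, \\
\text{efficiency} &= \frac{1/T}{S \cdot P_{\text{total}}}
  \propto \frac{1}{g^{2}\log_{2} n \cdot (C^{2}/g^{2}) \cdot C^{2}}
  \propto \frac{1}{\log_{2}(C/g)} .
\end{align*}
```

The factors of g^2 cancel, so only the logarithmic delay term depends on the genus, which is why the efficiency varies slowly with g.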



4.2 Comparison of the Efficiency of Hyperelliptic Cryptosystems and Others 

Here, we compare the efficiency of hyperelliptic cryptosystems with that of other 
cryptosystems. The results of hardware implementation of RSA have been reported in 
[HTAA90], [IMI92], and [SKNOM97]. The results of 512-bit-key RSA are 
summarized in the table below. 



Table 8. Performance of RSA hardware 





              Size                           Computation time @20MHz

[HTAA90]      1050Kbit RAM + 305K gates      2.0 ms
[IMI92]       198K gates                     2.5 ms
[SKNOM97]     -                              14.0 ms



We want to compare by using 1024-bit-key RSA, which has the same level of
security as a 160-bit-key ECC (elliptic curve cryptosystem), but this key length is not
covered in [HTAA90] and [IMI92]. Moreover, [SKNOM97] cannot be used directly,
because its arithmetic unit was designed by using a custom design methodology.
Therefore, we assume that the computation time is proportional to the square of the
key length, and multiply the time in Table 8 by 4 for 1024-bit-key RSA. The data
given in Tables 6, 7, and 8 show that hyperelliptic curve cryptosystems are 3 (resp.
12.6) times faster than RSA and that the size is smaller in g = 3/Case A (resp.
g = 6/Case B).
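
The scaling argument can be written out as follows. This is a sketch under the quadratic-scaling assumption stated in the text; that the quoted factors refer to the fastest entry of Table 8, [HTAA90], is an inference from the numbers rather than something the text states.

```python
# 512-bit RSA times at 20 MHz from Table 8, in milliseconds.
RSA_512_MS = {"[HTAA90]": 2.0, "[IMI92]": 2.5, "[SKNOM97]": 14.0}

# Hyperelliptic kD times at 20 MHz from Table 6, in milliseconds.
HEC_MS = {"g=3/Case A": 2.624, "g=6/Case B": 0.632}


def scale_rsa(time_ms: float, from_bits: int = 512, to_bits: int = 1024) -> float:
    """Scale an RSA time assuming it grows with the square of the key length."""
    return time_ms * (to_bits / from_bits) ** 2


rsa_1024_ms = scale_rsa(RSA_512_MS["[HTAA90]"])
speed_ratio = {case: rsa_1024_ms / t for case, t in HEC_MS.items()}

if __name__ == "__main__":
    print(f"1024-bit RSA estimate: {rsa_1024_ms:.1f} ms")
    for case, ratio in speed_ratio.items():
        print(f"{case}: about {ratio:.1f} times faster")
```

With the 2.0 ms entry the estimate is 8 ms for 1024-bit RSA, giving ratios of roughly 3.0 and 12.7.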

Next, we consider a comparison with elliptic curve cryptosystems. Since the
computation of ECCs is more complicated than that of RSA, a co-processor approach
that gives the basic Galois field arithmetic is often used for the implementation
[AVN93]. In [TOH98], it is reported that an average processing time of 32 ms is
needed for hardware with 22K gates to compute kD at an operating frequency of 20
MHz. In comparison with [TOH98], hyperelliptic curve cryptosystems defined over
Jacobians of genus 6 or 3 are about 50 or 100 times faster when the multiplier of
Case B is used at a frequency of 20 MHz, though the sizes are 8 or 12 times larger.

The numbers of multiplication and inversion operations have been well studied in
elliptic curve cryptosystems. IEEE P1363 [P1363] proposes Jacobian coordinates. In
the binary case, the numbers of multiplications needed to compute projective elliptic
addition and doubling are 15 and 5, respectively. Assuming that the hardware of a
160-bit-key elliptic curve cryptosystem has registers and only one multiplier, which
takes one clock cycle for computation using the normal bases, the size of the
hardware will exceed 270K cells, because the size of the multiplier is larger than
256K cells and the size of four input and three working registers is at least 14K cells.

In comparison with g = 3/Case B, the elliptic curve cryptosystem is larger but
about 3.3 times faster, because the processing time for addition in the case of a
hyperelliptic cryptosystem is 54t(M), while that in the case of an ECC is 15t(M).
In [SSI98], the performance of hyperelliptic curve cryptosystems is compared with 
that of RSA and elliptic curve cryptosystems from the viewpoint of software 
implementation and it is shown that the efficiency of hyperelliptic curve 
cryptosystems is better than that of RSA but worse than that of elliptic curve 
cryptosystems. This agrees well with our analysis of hardware implementation. 
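
The factor of about 3.3 is consistent with comparing the cost of a whole scalar multiplication rather than a single addition. The sketch below assumes the same operation counts as Table 6 (160 doublings and 80 additions) for both systems, and an elliptic curve unit with one multiplier so that each multiplication costs one t(M); the function name is hypothetical.

```python
def scalar_multiplication_cost(doubling: int, addition: int,
                               doublings: int = 160, additions: int = 80) -> int:
    """Total cost of computing kD, in units of t(M)."""
    return doublings * doubling + additions * addition


# Hyperelliptic, g = 3: processing times from Table 5.
hec_cost = scalar_multiplication_cost(doubling=14, addition=54)

# Elliptic curve, projective coordinates: 5 and 15 multiplications.
ecc_cost = scalar_multiplication_cost(doubling=5, addition=15)

ratio = hec_cost / ecc_cost

if __name__ == "__main__":
    print(f"HEC g = 3: {hec_cost} t(M), ECC: {ecc_cost} t(M), ratio {ratio:.2f}")
```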



5. Concluding Remarks 

The discussion and results of hardware implementation presented in the previous
section show that the hardware efficiency is almost the same for curves of small
genus, and that it is possible to design the hardware by using the proposed algorithm
without assuming unreasonable conditions with regard to current semiconductor
technology. In other words, we can choose a curve of any genus to realize the same
level of security by hardware means with the same effort, as long as the genus of the
curve is less than 5. This offers an inducement for the use of hyperelliptic curve
cryptosystems.



Abstract. Designing security of wide-area distributed systems is a
highly complicated task. The complexity of underlying distribution and
replication infrastructures together with the diversity of application
scenarios increases the number of security requirements that must be
addressed. High assurance requires the security enforcement to be isolated
from non-security relevant functions and limited in the size of
implementation. The major challenge is to find a balance between the
diversity of security requirements and the need for high assurance. This
paper addresses this conflict using the Globe system as a reference
framework, and establishes a security design that provides a flexible means
of addressing the variety of security requirements of different application
domains.



1 Introduction 

Security design refers to the interfaces and services that must be incorporated 
into the system to enable addressing of different security requirements [5]. The 
security design must be such that it enables verification and validation of the se- 
curity enforcement to achieve high assurance. Assurance refers to the confidence 
that the security enforcement is appropriate. 

A number of generic considerations must be addressed by the security de- 
sign to achieve high assurance. For example, the amount of trusted code should 
be kept to a minimum, duplicate security functions should be eliminated, the 
trusted code should be designed to enable verification and validation, and the 
software should be designed to enable code optimization for different hardware 
platforms [1]. 

Addressing these considerations leads towards a trusted computing base
capable of verifiably enforcing a small number of security requirements. Diversity
of security requirements is often a prohibitive factor for high assurance [8].
However, security requirements of distributed object systems are of a very high
diversity. Not only secure communication requirements, e.g. encryption and
authentication, but also requirements of secure operating systems, e.g. secure
method or function execution and access control, and those of secure
communication systems, e.g. traffic filtering and client behavior monitoring for
intrusion and misuse detection, must be addressed.

Designing security on development platforms for distributed shared objects
(DSO), such as Globe [15], that provide transparent distribution and replication
of objects over multiple physical locations further complicates the security
design. The major advantage of Globe type systems over, say, CORBA is
scalability. CORBA, DCOM and other existing distributed object technologies
assume a static replication model, whereas Globe enables per-object replication
strategies, allowing increased flexibility in designing global distributed object
systems [3].

Security design must also be established in a manner that allows
object-specific security policies to be established and maintained without
limiting the range of applications [16].

A number of security architectures, such as Kerberos, Sesame, and DSSA,
have been established for distributed and networked systems [9,12]. Such
architectures are mostly concerned with the development of secure applications
on networked and distributed environments. They do not address the security of
the inherently complicated distribution infrastructure itself. Scalability to global
systems is also questionable.

This paper examines the problems associated with security designs of global 
DSO architectures in general, and Globe in particular. We begin by identifying 
the security requirements of distributed object systems and proceed by examin- 
ing the challenges that the security designer faces when attempting to address 
these requirements. This is followed by a comparison of two possible security de- 
signs. The implementation aspects of the chosen design shall then be discussed. 
Finally, conclusions shall be drawn and directions highlighted for future work. 

2 Security Requirements in Distributed Object Systems 

Security requirements of distributed systems must be addressed through commu- 
nication security, operating system security, and network security requirements. 
There are also object life-time requirements dealing with creation, binding and 
disposal of objects. Security management must be addressed, as well as ed- 
ucational and operational security requirements, and other pervasive security 
requirements. 

In the following, examples of security requirements are given for each category.
The list is comprehensive, yet it is questionable whether it can ever be complete.
Also, the identified requirements are partially overlapping and each one may
contribute to more than one security objective.

For example, message semantics based filtering of protocol messages is a
means of achieving access control. If the traffic filtering is enforced at the
application level, there is a close relationship to access control based on method
execution requests, which is to be addressed at the operating system level.
However, in most cases the traffic filtering is applied at lower levels, for example
at the network layer, where it must be considered a means of network security
instead of a means of operating system security. Both ways still contribute
towards access control.

2.1 Communication Security Requirements 

Communication security is mostly concerned with cryptographic techniques to
achieve confidentiality, integrity, authenticity, and non-repudiation of
communicated messages. The communication security requirements of distributed
object systems are not fundamentally different from general communication
security requirements, for example those of the ISO OSI standard [9].

Depending on the type of communication, confidentiality can be addressed
through the Connection-oriented confidentiality requirement or the
Connectionless confidentiality requirement. Not all the protocol fields may
require equal security, and it may become appropriate to address the Selective
field confidentiality requirement. In some environments, even the fact that
communication between certain hosts occurs may be sensitive, and the Traffic
flow confidentiality requirement must be addressed.

Integrity measures may be implemented with or without recovery. Those
supporting recovery allow the reconstruction of the message from the integrity
check, whereas those without recovery can only be used for verifying the
correctness of pairs of messages and integrity checks. As integrity can be
addressed for entire protocol messages or for selective fields, the Message
integrity requirement with recovery, Message integrity requirement without
recovery, Selective field integrity requirement with recovery, and Selective field
integrity requirement without recovery must be addressed.
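
As an illustration of integrity without recovery, the sketch below pairs a message with a keyed integrity check and verifies the pair; the check cannot be used to rebuild the message. The choice of HMAC-SHA256 and the JSON encoding of selected fields are assumptions made for this example and are not prescribed by the design; the function names are hypothetical.

```python
import hashlib
import hmac
import json


def integrity_check(key: bytes, message: bytes) -> bytes:
    """Integrity check over an entire protocol message (no recovery)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, check: bytes) -> bool:
    """Verify that a (message, check) pair is consistent."""
    return hmac.compare_digest(integrity_check(key, message), check)


def selective_field_check(key: bytes, fields: dict, protected: list) -> bytes:
    """Integrity check covering only the selected protocol fields."""
    subset = {name: fields[name] for name in protected}
    encoded = json.dumps(subset, sort_keys=True).encode("utf-8")
    return integrity_check(key, encoded)


if __name__ == "__main__":
    key = b"shared secret for the example"
    message = b"invoke: read_document"
    check = integrity_check(key, message)
    print(verify(key, message, check))                    # pair is consistent
    print(verify(key, b"invoke: delete_document", check))  # altered message
```

With selective field integrity, a change to an unprotected field (for example a hop counter) leaves the check valid, while a change to a protected field invalidates it.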

Authentication can be applied either to a peer object or to the origin of
data. Additional measures can be provided for client and user authentication.
However, they should not be addressed at the technical infrastructure of object
distribution. The Peer object authentication requirement must be addressed
when data from a communicating software module, such as a protocol stack
implementation, must be authenticated to the peer object in communications.
Closely related is the Peer object integrity requirement, where assurance
must be provided to a peer object in a communicating system that the peer
object implementation has not been altered by, for example, a Trojan horse. The
Data origin authentication requirement addresses the authentication of the
communicating hosts as sources of protocol messages.

The Non-repudiation of origin requirement addresses concerns of a sender
of a message repudiating participation in communication. The Non-repudiation
of receipt requirement provides means to verify that a particular recipient has in
fact received a message. The Non-repudiation of delivery requirement is more
complicated. Usually, it cannot be addressed at the application level.

2.2 Operating Systems Security Requirements

Traditionally, operating system security has been concerned with access
control to protect files from unauthorized reading or modification, or to prevent
unauthorized execution of system administration software. In distributed object
systems, the term object can refer to objects of any granularity. Therefore, the
Object integrity requirement and the Object confidentiality requirement can
refer to any operating system object, or to the distributed shared object as a
whole. The Object level access control requirement refers to the measures to
determine which accesses are allowed within the DSO.

From the DSO point of view, object confidentiality and integrity requirements
refer to the state of the entire distributed shared object remaining confidential
or unaltered, and access control requirements deal with which clients are allowed
to bind to the DSO. At such a coarse granularity, security issues must be
addressed through object life-time security measures; addressing them at the
operating system level is hard.

With the emergence of new networking technologies and new programming
languages, the scope of operating system security extends into more active
control of, for example, program execution. A typical example of extended
operating system security functionality is the Java Virtual Machine (JVM). Such
requirements can be addressed through Secure method execution requirements.

In addition to addressing the communication security requirements to
prevent method invocations from remote hosts being tampered with, the Method
integrity requirement must be addressed to assure remote clients of the
correctness of methods stored and executed in remote hosts. This is different
from the peer object integrity requirement in the sense that it addresses the
actual methods provided by the DSO, not the integrity of the methods of the
replication and distribution infrastructure.

2.3 Network Security Requirements 

Network security is addressed to provide assurance of the correct operation of
the distributed system that employs the above security technologies. Firewall
capabilities are provided by the Message semantics filtering requirement
that addresses the selective transmission of protocol messages to block unwanted
protocol messages from being processed by the distributed object.

At the distributed object level this means selective processing of control 
messages and remote method invocations. At the underlying communication 
infrastructure, this means selective forwarding of suspicious datagrams. 

Client behavior monitoring requirement and Method execution se- 
quence monitoring are means to detect intrusions and misuse and trigger 
appropriate alarms. 
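
At the distributed object level, filtering and monitoring could look roughly as follows. This is a hypothetical sketch: the policy format (a mapping from client names to permitted method names), the class names, and the alarm threshold are illustrative assumptions and not part of Globe.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Invocation:
    """A remote method invocation request as seen by the filter."""
    client: str
    method: str


@dataclass
class SemanticsFilter:
    """Admits only invocations permitted by the policy and counts rejections."""
    allowed: dict                      # client name -> set of method names
    alarm_threshold: int = 3           # rejections before an alarm is raised
    rejected: Counter = field(default_factory=Counter)

    def admit(self, invocation: Invocation) -> bool:
        """Selective processing: return True if the request may be processed."""
        permitted = self.allowed.get(invocation.client, set())
        if invocation.method in permitted:
            return True
        self.rejected[invocation.client] += 1
        return False

    def alarms(self) -> list:
        """Client behavior monitoring: clients with too many rejections."""
        return sorted(c for c, n in self.rejected.items()
                      if n >= self.alarm_threshold)


if __name__ == "__main__":
    policy = {"alice": {"read", "write"}, "bob": {"read"}}
    message_filter = SemanticsFilter(policy, alarm_threshold=2)
    for request in [Invocation("bob", "read"), Invocation("bob", "write"),
                    Invocation("bob", "delete")]:
        print(request, message_filter.admit(request))
    print("alarms:", message_filter.alarms())
```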

2.4 Object Life-Time Security Requirements 

There are a number of security requirements that must be addressed through the
object life-time instead of during the operation of the object. The most important
of these concern secure binding and secure disposal of the components of a DSO.
When a new client wishes to connect to a DSO, it must initiate the binding
procedure.

The first step in binding is that the candidate client contacts a name server
to request the unique object identity that matches the symbolic name of the DSO
to which the connection is to be established. The name service must be protected
by addressing the secure name service requirements.

Once in the possession of the unique object identity, the new client proceeds 
with the binding by contacting the location server. The location server returns 
the contact address of the DSO as a pair of network address and port to connect 
to. This phase must be protected by addressing the secure location service 
requirements. 

In the following step, the new client contacts the implementation repository 
where the program code is loaded to construct the local representative of the 
DSO in a local address space. This step must be protected by the secure imple- 
mentation repository requirement. In fact, there are a number of open issues 
in the security of downloading executable content that must also be addressed 
in case of non-local implementation repositories. 

With the implementation of the local representative in place, the client can
proceed with the binding to the DSO using the newly created local representative
and the contact address received from the location service. The secure
connection establishment requirement must address the issues related to the
establishment of the connection between the local representative and the DSO.
This is also the phase where end-user security requirements, such as user
authentication, must be addressed.

At the end of the life-time, the local representative disconnects from the
DSO. This includes disposal of the local code as well as the state of the DSO
and the security state of the communication. This must be addressed through
the secure object disposal requirement.
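
The binding and disposal steps described in this section can be summarized in a short sketch. Everything below is hypothetical: the three services are modelled as in-memory dictionaries, the names and addresses are made up, and the security checks that each step would need are reduced to a single authentication flag.

```python
from dataclasses import dataclass


class BindingError(Exception):
    """Raised when a binding step fails."""


# Hypothetical in-memory stand-ins for the three services.
NAME_SERVICE = {"example/document": "oid-42"}
LOCATION_SERVICE = {"oid-42": ("192.0.2.10", 7000)}
IMPLEMENTATION_REPOSITORY = {"oid-42": "local-representative-code"}


@dataclass
class LocalRepresentative:
    object_id: str
    contact_address: tuple
    code: str
    connected: bool = False

    def dispose(self) -> None:
        """Secure disposal: drop local code, contact state and connection."""
        self.code = ""
        self.contact_address = ()
        self.connected = False


def bind(symbolic_name: str, user_is_authenticated: bool) -> LocalRepresentative:
    """Walk through the binding steps in the order given in the text."""
    try:
        object_id = NAME_SERVICE[symbolic_name]         # name service
        contact_address = LOCATION_SERVICE[object_id]   # location service
        code = IMPLEMENTATION_REPOSITORY[object_id]     # implementation repository
    except KeyError as missing:
        raise BindingError(f"lookup failed for {missing}") from missing
    representative = LocalRepresentative(object_id, contact_address, code)
    if not user_is_authenticated:                       # connection establishment
        raise BindingError("user authentication failed")
    representative.connected = True
    return representative


if __name__ == "__main__":
    representative = bind("example/document", user_is_authenticated=True)
    print(representative)
    representative.dispose()
    print(representative)
```

Each lookup in the sketch marks a point where one of the requirements above (secure name service, secure location service, secure implementation repository, secure connection establishment) would have to be enforced.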



2.5 Pervasive Security Requirements 

Pervasive security requirements can be studied from a number of points of view. 
One point of view is the security requirements related to the management of 
information security. These requirements include, for example, security planning 
and security maintenance. Also, a number of security education and awareness 
requirements must be addressed. However, as these bear no significance to the 
establishment of a security design for DSO systems, they shall not be further 
studied herein. 

Other pervasive security requirements are those that are needed for properly
implementing specific security requirements. For example, security labels,
trusted implementation of security functions and so forth must be addressed
when implementing the security design in a particular application scenario.

Finally, there are a number of implementation requirements that have
security relevance, even though they are not addressed through actual security
measures. A typical example is the total ordering of method invocations to
prevent inconsistent states of a DSO by incorrect method execution sequences. A
comprehensive treatment is provided, for example, by Birman [4].

Fig. 1. Implementation of a Globe local object

These are often considered to be issues addressed from the reliability
engineering and dependable computing point of view. They are more concerned
with continuity and correctness of services in the presence of random and
independent failures, not in the presence of active attacks, i.e. selective failures.
Therefore, they shall not be made a part of the security design.

3 Challenges of Security Design 

To examine the challenges of security design in wide area distributed systems, 
the Globe object architecture will be used in this paper as a concrete example. 
After the introduction of the Globe object architecture, three security design 
challenges shall be addressed: selection of the security design model, coping with 
the diversity of security requirements, and the placement of security measures 
within the object architecture. 



3.1 Globe Object Architecture 

A central construct in the Globe architecture^ is a distributed object. A dis- 
tributed object is built from a number of local objects that reside in a single 
address space and communicate with local objects residing in other address 
spaces. 

Together, the local objects form the implementation of a particular DSO. 
Local objects consist of the actual interface and the distribution mechanism, as 
illustrated in Fig. 1. The distribution mechanism enables transparent distribu- 
tion and replication of objects, hiding details from application developers. 



^ More details available at http://www.cs.vu.nl/globe/

The semantics subobject contains the methods for the functionality of the
DSO. This is the only subobject the application developer must develop himself.
It is comparable to objects in middleware architectures such as DCOM and
CORBA.

The communication subobject is responsible for the communication between
local objects residing in different address spaces. It implements a standard
interface but can have several implementations depending on the particular
communication needs, and provides a platform-independent abstraction of the
underlying networks and operating systems providing the communication
services.

The replication subobject replicates and caches the local objects and
constructs the DSO from local objects. It also implements coherence protocols to
decide when methods of the local semantics subobject can be invoked without
violating the consistency policy.

The control subobject invokes the semantics subobject's methods. It also
marshals and unmarshals invocation requests passed between itself and the
replication subobject.
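
The division of labour between the four subobjects can be sketched as follows. This is not Globe code: the class names, the JSON marshalling, the loopback communication and the trivial replication policy are all assumptions chosen to keep the example small and runnable.

```python
import json


class CounterSemantics:
    """Semantics subobject: the only part written by the application developer."""
    def __init__(self) -> None:
        self.value = 0

    def increment(self, amount: int) -> int:
        self.value += amount
        return self.value


class LoopbackCommunication:
    """Communication subobject: here it only records what would be sent."""
    def __init__(self) -> None:
        self.sent = []

    def send(self, payload: bytes) -> None:
        self.sent.append(payload)


class SimpleReplication:
    """Replication subobject: decides whether a local invocation may proceed."""
    def __init__(self, communication: LoopbackCommunication) -> None:
        self.communication = communication

    def may_invoke(self, method: str) -> bool:
        return True  # placeholder for a real coherence protocol

    def propagate(self, request: bytes) -> None:
        self.communication.send(request)


class ControlSubobject:
    """Control subobject: marshals requests and invokes the semantics subobject."""
    def __init__(self, semantics, replication: SimpleReplication) -> None:
        self.semantics = semantics
        self.replication = replication

    def invoke(self, method: str, *args):
        request = json.dumps({"method": method, "args": list(args)}).encode()
        if not self.replication.may_invoke(method):
            raise RuntimeError("invocation would violate the consistency policy")
        self.replication.propagate(request)          # hand over to replication
        call = json.loads(request.decode())          # unmarshal before invoking
        return getattr(self.semantics, call["method"])(*call["args"])


if __name__ == "__main__":
    communication = LoopbackCommunication()
    control = ControlSubobject(CounterSemantics(), SimpleReplication(communication))
    print(control.invoke("increment", 5))
    print(communication.sent)
```

In such a layout every invocation passes through the control and replication subobjects before it reaches the semantics subobject, which is where security measures for those subobjects would have to be placed.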



3.2 Security Design Model 

Several security design models have been established over the years. Most of
them, however, focus on multilevel secure systems and databases (e.g. [1,6,14])
instead of on the security of conventional applications on general purpose
operating systems. High dependence on risk analysis further limits the
applicability of many models (e.g. [2,13]).

Limitations of risk analysis become evident in the design of the security of 
system development and distribution platforms. Since underlying implementa- 
tion technologies and operational environment are not known at the time of 
security design, neither threats nor losses can be estimated. However, a system- 
atic approach, such as [10], is required for guiding the security design. 

The major advantages of [10] are that it reduces risk analysis to a decision
making tool and that it is heavily based on the Common Criteria for security
evaluation [7]. It divides security development into three stages. The first stage
deals with the specification of all relevant security functions capable of satisfying
a certain security objective. The second stage aims at selecting a subset of all
possible security requirements to be implemented in a particular system. Selected
measures are implemented and evaluated at the third stage.

This paper deals with the first stage of the model. A possible set of security
requirements of a distributed object system is identified and a security design
is established to aid in the implementation of security measures to address those
requirements. When designing applications using Globe, the particular operating
system and communication mechanisms can be selected, and a subset of possible
security countermeasures implemented to match the specific application level
security policy.


Table 1. Possible security requirements of Globe as divided among the underlying
communication infrastructure (UCI), communication subobject (CoS), replication
subobject (RS), control subobject (CS), application (AL), and operating
system (OS)

Requirement                                               UCI  CoS  RS   CS   AL   OS

Connection-oriented confidentiality requirement           X    X    X    X    X    X
Connectionless confidentiality requirement                X    X    X    X    X    X
Selective field confidentiality requirement               X    X    X    X    X    X
Traffic flow confidentiality requirement                  X    X    X    X    X    X
Message integrity requirement with recovery               X    X    X    X    X    X
Message integrity requirement without recovery            X    X    X    X    X    X
Selective field integrity requirement with recovery       X    X    X    X    X    X
Selective field integrity requirement without recovery    X    X    X    X    X    X
Peer-object integrity requirement                         X    X    X    X    X    X
Message semantics filtering requirement                   X    X    X    X    X    X
Peer entity authentication requirement                    X    X    X    X    X    X
Data origin authentication requirement                    X    X    X    X    X    X
Non-repudiation of origin requirement                     X    X    X    X    X    X
Non-repudiation of receipt requirement                    X    X    X    X    X    X
Non-repudiation of delivery requirement                   X    -    -    -    -    -
Object level access control requirement                   -    -    X    X    X    X
Client behavior monitoring requirement                    -    -    -    X    X    X
Method execution sequence monitoring requirement          -    -    -    X    X    X
Method integrity requirement                              -    -    X    -    X    X
Object integrity requirement                              -    -    X    -    X    X
Object confidentiality requirement                        -    -    X    -    X    X
Secure method execution requirement                       -    -    -    X    -    X

3.3 Diversity of Security Requirements 

The ideal case of security in distributed object systems is a dedicated security
subobject that implements security measures similarly to a traditional reference
monitor. However, it is not obvious how this can be achieved in practice, given
the high diversity of security requirements and the possibility of a single
security requirement being addressed at different subobjects. The possible
components where different security requirements of distributed object systems
can be addressed within the Globe object architecture are illustrated in Table 1.

The semantics subobject does not have security relevance to the Globe archi- 
tecture since it does not participate in replication and distribution. However, a 
high number of security requirements may be addressed at the application level 
through the semantics subobject. 

The security architecture is also independent of the underlying communication
architecture. A TCP/IP network could implement packet confidentiality and
authenticity in the form of the IPSEC standard, or transport layer security by
SSL or SSH. In more advanced scenarios, network traffic could be authenticated
and access control provided by Kerberos, Sesame, or DSSA/SPX. Since the
objective of Globe is flexibility and platform independence, no assumptions
about the available security services can be made. All communication security
measures may need to be implemented at the communication subobject.

The communication subobject security is concerned with secure communication
channels between local objects. Requirements are those of secure communication,
i.e. confidentiality, integrity, authenticity, and non-repudiation. Access
control can be enforced through traffic filtering based on protocol messages.

In group communication, communication security measures can be applied at
the replication subobject or control subobject on a per-message rather than a
per-recipient basis. If the communication subobject manages group communication
through a number of point-to-point channels, this can significantly reduce the
cryptographic overhead.
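The saving can be made concrete with a minimal sketch. It assumes HMAC-SHA256 as the authentication code and hypothetical function names; neither is prescribed by Globe. With pairwise keys the sender computes one code per recipient, whereas with a key shared by the group it computes a single code before the message is fanned out over the point-to-point channels.

```python
import hashlib
import hmac
from typing import Dict


def mac(key: bytes, message: bytes) -> bytes:
    """Authentication code of a buffer (HMAC-SHA256 is an assumption)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def protect_per_recipient(pairwise_keys: Dict[str, bytes], message: bytes) -> Dict[str, bytes]:
    """Per-recipient basis: n keyed-hash computations for n recipients."""
    return {name: mac(key, message) for name, key in pairwise_keys.items()}


def protect_per_message(group_key: bytes, message: bytes) -> bytes:
    """Per-message basis: one computation, done above the communication subobject."""
    return mac(group_key, message)
```

Note that a code computed under a shared group key only shows that some group member produced the message, so the per-message variant trades sender-level authenticity for lower overhead.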

The replication subobject security is also concerned with the enforcement of 
secure replication of objects, and prevention of malicious parties from altering 
the DSO state, interface or implementation. 

The control subobject and the underlying operating system are responsible for
secure execution of the methods of the semantics subobject. Method level access
control can be provided to decide which methods can, under which constraints,
be invoked in the local environment. Client behavior and method execution
sequences can be monitored for intrusion and misuse detection.

3.4 Placement of Security Functionality 

Consider a simple entity authentication protocol [11, p.402]. B initiates the
protocol by sending a random value r_B to A. A replies with a random number r_A
and a keyed hash h_K(r_A, r_B, B). B then sends the value h_K(r_B, r_A, A) to A,
allowing both parties to verify each other's authenticity through knowledge of
the key K shared by A and B.

    A \leftarrow B : r_B                           (1)
    A \rightarrow B : r_A, h_K(r_A, r_B, B)        (2)
    A \leftarrow B : h_K(r_B, r_A, A)              (3)


This (or a similar) protocol is likely to be implemented at several subobjects
requiring peer-object authentication. An obvious design is to separate
calculation of the hash value from the protocol execution logic. Protocol
implementation becomes easier as the hash function can be treated as a black box
and implemented separately, possibly in hardware.
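The separation can be sketched as follows. This is a minimal illustration that runs both parties in one process; it assumes HMAC-SHA256 as the keyed hash h_K and length-prefixed field encoding, and the function names are hypothetical. The protocol logic only calls `keyed_hash`, so that function can be swapped for another implementation without touching the message flow.

```python
import hashlib
import hmac
import secrets


def keyed_hash(key: bytes, *fields: bytes) -> bytes:
    """Black-box keyed hash h_K (HMAC-SHA256 is an assumption)."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for field in fields:
        # Length-prefix each field so the concatenation is unambiguous.
        mac.update(len(field).to_bytes(4, "big") + field)
    return mac.digest()


def run_protocol(key_of_a: bytes, key_of_b: bytes) -> bool:
    """Execute steps (1)-(3); return True only if both parties accept."""
    id_a, id_b = b"A", b"B"
    r_b = secrets.token_bytes(16)                 # (1) B sends r_B to A
    r_a = secrets.token_bytes(16)                 # (2) A sends r_A, h_K(r_A, r_B, B)
    tag_from_a = keyed_hash(key_of_a, r_a, r_b, id_b)
    if not hmac.compare_digest(tag_from_a, keyed_hash(key_of_b, r_a, r_b, id_b)):
        return False                              # B rejects A
    tag_from_b = keyed_hash(key_of_b, r_b, r_a, id_a)   # (3) B sends h_K(r_B, r_A, A)
    return hmac.compare_digest(tag_from_b, keyed_hash(key_of_a, r_b, r_a, id_a))
```

Calling `run_protocol` with the same key for both parties succeeds; with different keys B rejects the message of step (2).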

Availability of a mutually agreed upon hash-function between entities is re- 
quired in steps (2) and (3). To negotiate the function and to store security state, 
such as cryptographic keys, a security context must be maintained by the com- 
munication parties. 

The protocol logic can be implemented as a separate function or as, say, a
Java object (not a Globe object, though) that is called or instantiated by
objects that need to authenticate to their peer objects. The subobject
instantiates the authentication object and defines the parameters, such as the
behavior in error conditions. This complicates the object interface but provides
high encapsulation of security-relevant processing in a dedicated authentication
protocol object.

Not all protocol errors imply an authentication failure. They may be due
to network congestion, excessive workload at the peer object, or some other
random condition. The recovery logic must decide which action to take:
whether to treat the peer object as unauthentic, to proceed with the service
request and assume further authentication at other subobjects, to block the
service request and retry authentication after a delay, or to take some other
action.

Parameterization of all possible error conditions at different subobjects leads 
to a complex exception handler or to an increased interaction between the secu- 
rity subobject and conventional subobjects. This reduces functional cohesion of 
the security subobject and leads to weaker encapsulation. 

An attempt to isolate the security functionality has, therefore, led to an
increased complexity of the protocol object. Each authentication request must
be related to a number of other security-relevant objects, such as the security
context and the exception handler. Proper software engineering practice, such as
thin interfaces and functional encapsulation, can, however, improve the design.

Each subobject has to be given a unique identity, expressed as the identity of a 
local object and the particular subobject. This identity is used for authentication. 
The protocol execution logic can be easily separated from the subobject but 
the semantics of different protocol messages has to be bound to a particular 
subobject. This encourages implementation of the authentication protocol as 
part of the conventional subobject as it is mostly aware of various subobject- 
specific semantic conventions. 

The denial of service aspects should also be kept in mind. Globe objects 
are typically distributed using public networks, e.g. the Internet, with limited 
quality of service guarantees. For example, TCP guarantees an ordered delivery 
of messages but not the maximum time for message transmission. Protocols that 
depend on a TCP connection between two hosts may cause serious performance 
penalties due to network congestion outside the control of any local object. 

Resource allocation policies may be defined for protocol steps, or stateless
protocols may be designed, to prevent denial of service. However, protocols have
to be carefully designed and evaluated for optimal performance and reaction to
exceptional conditions. Therefore, they should be dealt with as independent
software artifacts.

Protocol implementations are also likely to require a preparedness plan in
terms of exception handling to recover from situations where a critical resource,
such as protocol execution time, exceeds a threshold. Recovery is highly
subobject-specific and difficult to generalize into a common protocol
implementation.

These issues complicate the decision of whether the security should be man- 
aged by a dedicated security subobject, or by each conventional subobject inde- 
pendently. The following section provides a detailed analysis and evaluation of 
the two alternatives. 
Fig. 2. A Generic DSO security design



4 The Security Design 

A generic security design for a DSO, as illustrated in Fig. 2, consists of a
security subobject, a security policy, and a number of security associations
(SAs). SAs describe the security state of communication channels and may be
shared by multiple parties. They contain, at least, encryption and
authentication keys, modes of algorithms, and other parameters such as
initialization vectors and the SA lifetime.
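As an illustration, the minimum content of an SA listed above could be held in a plain record. The field names, default values, and the expiry check are assumptions made for this sketch, not part of the Globe design.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SecurityAssociation:
    """Security state of one communication channel (hypothetical field names)."""
    encryption_key: bytes
    authentication_key: bytes
    cipher_mode: str = "CBC"                     # mode of the algorithm
    iv: bytes = field(default_factory=lambda: secrets.token_bytes(16))
    lifetime_seconds: float = 3600.0             # SA lifetime
    created_at: float = field(default_factory=time.time)

    def is_expired(self, now: Optional[float] = None) -> bool:
        """True once the SA has outlived its lifetime and must be renegotiated."""
        current = time.time() if now is None else now
        return current - self.created_at >= self.lifetime_seconds
```

The record carries no behavior beyond the expiry check, which matches the view taken later in the paper that an SA is a data structure rather than a functional component.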

Prior to secure communication, peer objects must establish an SA through
online or offline negotiations. The initial state of the security association is
downloaded during the binding. The number of security associations maintained
by a local object may differ between implementations and application
environments.

The SA must be supported by a security subobject that contains the
implementation of the corresponding security measures. The security subobject
implements a certain communication security policy. The communication security
policy is fairly static, but security associations may change dynamically.
Additional security policies are required for access control and intrusion
detection.

Replacing a communication security policy means binding to a different security
subobject. Full policy-mechanism independence, as in access control models, is
hard to achieve due to the difficulties of formally expressing communication
security requirements. The local object may initially contain an implementation
of a number of security subobjects or download them when necessary.

There are two main alternatives for coordinating the security enforcement:

Centralized security coordination (CSC), where conventional subobjects
request all security measures to be executed by the security subobject.

Distributed security coordination (DSC), where conventional subobjects
execute the security measures but depend on the security subobject for
critical functions, such as encryption and decryption of buffers.

In CSC, the communication security policy is followed by the security subobject
that enforces the policy and executes the required security measures. In
DSC, the communication security policy is followed by conventional subobjects.

In the following, the two shall be compared and evaluated against a number 
of security design criteria. The comparison is followed by a discussion about the 
security association and different security policies. A clear distinction between 
the two is impossible in practical systems. The implemented system is likely to 
be a hybrid. However, the comparison suggests that implemented systems should 
bear more characteristics of centralized than distributed security coordination. 



4.1 Centralized Security Coordination 

Communication between local objects is mediated by the security subobject. 
The security subobject maintains the security associations and negotiates their 
content with the security subobject of the peer local object. Certain security pro- 
cessing of messages is carried out at each passing of messages between different 
subobjects. 

Prior to passing a protocol message to a lower or upper level in the subobject 
hierarchy, a subobject passes it to the security subobject. The security subobject 
applies the security measures to protocol messages and returns. The subobject 
that called the security subobject passes the security enhanced protocol message 
to the next subobject. The same process is repeated at each subobject and 
reversed at the receiving local object. 

As a minimum, the security subobject only requires an interface for passing 
and receiving messages to and from subobjects. Each subobject interfaces with 
the security subobject through a common interface that may have different im- 
plementations. The calling sequence from subobjects to the security subobject 
can be standardized. This allows replacement of the security subobject without 
modifying other subobjects. 

Full isolation of the security subobject is hard to achieve, mostly due to
exception handling. Which action is taken if a security measure, e.g. data origin
authentication, fails? The interface can be standardized to return a number of
status codes that the conventional subobject can use for examining the status of
the security processing. For example, a standardized security exception could
be thrown by the security subobject and caught by the conventional subobject.
Implementations in languages such as C may return a standardized error code.
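The exception-based variant might look like the following minimal sketch. The exception name is borrowed from the interface discussion later in the paper; the status strings, the class layout, and the use of HMAC-SHA256 for data origin authentication are assumptions of this example.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of an HMAC-SHA256 authentication code


class GlobeSecurityException(Exception):
    """Standardized security error; the status attribute is hypothetical."""

    def __init__(self, status: str):
        super().__init__(status)
        self.status = status


class SecuritySubobject:
    def __init__(self, key: bytes):
        self._key = key  # stands in for the SA kept by the security subobject

    def add(self, message: bytes) -> bytes:
        return message + hmac.new(self._key, message, hashlib.sha256).digest()

    def remove(self, protected: bytes) -> bytes:
        if len(protected) < TAG_LEN:
            raise GlobeSecurityException("MALFORMED_MESSAGE")
        message, tag = protected[:-TAG_LEN], protected[-TAG_LEN:]
        expected = hmac.new(self._key, message, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise GlobeSecurityException("DATA_ORIGIN_AUTHENTICATION_FAILED")
        return message


def receive(security: SecuritySubobject, protected: bytes) -> str:
    """Conventional subobject: catches the error and decides how to react."""
    try:
        return "accepted: " + security.remove(protected).decode()
    except GlobeSecurityException as exc:
        return "rejected: " + exc.status
```

The security subobject only reports what failed; the choice of recovery action stays with the conventional subobject, which is where the subobject-specific knowledge resides.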



4.2 Distributed Security Coordination 

The security subobject only provides basic security mechanisms to aid subob- 
jects in the enforcement of security as part of the subobject functionality. The 
conventional subobject maintains the SAs and implements the security logic. 

The security subobject provides basic security services, such as encryption
and decryption of buffers, generation and verification of authentication codes
of buffers, and so on. The conventional subobject calls these measures when
necessary according to the security logic. The receiving conventional subobject
implements the corresponding security protocols and executes the actions of a
receiving entity of the protocol.

Table 2. Comparison of security subobject designs

Criteria                        CSC design   DSC design
Economy of mechanism            Average      Average
Duplicate functions             Good         Poor
Optimization for new hardware   Good         Good
Complete mediation              Good         Poor
Least privilege                 Good         Poor
Future alterations              Good         Average
Ease of SA maintenance          Good         Poor

This approach introduces a classical trade-off between assurance and diversity
of security requirements. High assurance can be provided for, say,
cryptographic processing in this scheme, at the cost of reducing the security
subobject functionality and complicating the security implementation. In CSC,
equally high assurance can be achieved for cryptographic modules, but, in
general, higher assurance can also be achieved for the general security
processing.

The complexity can be reduced by standardized security libraries and proper
software engineering practices. Replacing and extending the security subobject
remains easy. Exception handling is logically connected to the protocol
execution. Adapting to a different communication security policy remains
complicated, though, since alterations are required in each conventional
subobject.
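For contrast with the centralized design, a DSC arrangement can be sketched as below. The class and method names are hypothetical and HMAC-SHA256 is an assumed authentication code. The security subobject is reduced to stateless primitives, while the conventional subobject (here a replication subobject) keeps the SA and implements the security logic itself.

```python
import hashlib
import hmac
from typing import Tuple


class SecurityPrimitives:
    """DSC-style security subobject: primitives only, no SAs, no protocol logic."""

    @staticmethod
    def make_code(key: bytes, buffer: bytes) -> bytes:
        return hmac.new(key, buffer, hashlib.sha256).digest()

    @staticmethod
    def verify_code(key: bytes, buffer: bytes, code: bytes) -> bool:
        return hmac.compare_digest(SecurityPrimitives.make_code(key, buffer), code)


class ReplicationSubobject:
    """Conventional subobject that maintains its own SA (reduced here to a key)."""

    def __init__(self, sa_key: bytes):
        self._sa_key = sa_key

    def send(self, state_update: bytes) -> Tuple[bytes, bytes]:
        # Security logic lives in the conventional subobject.
        return state_update, SecurityPrimitives.make_code(self._sa_key, state_update)

    def receive(self, state_update: bytes, code: bytes) -> bytes:
        if not SecurityPrimitives.verify_code(self._sa_key, state_update, code):
            raise ValueError("authentication code did not verify")
        return state_update
```

Every conventional subobject would need its own copy of the `send`/`receive` logic and its own access to key material, which is the duplication and privilege spread that the comparison in the next section counts against DSC.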



4.3 Comparison of Designs 

Many principles of security design (e.g. [1,14]) focus on access control models and 
their applications. In the following, the above design alternatives are evaluated 
against security design principles adapted to the DSO context. Findings are 
summarized in Table 2. 

Economy of mechanism refers to the small size of the implementation and the
simplicity of the design, which enable appropriate testing and evaluation. It is
unlikely that either design can achieve such economy of mechanism as to enable
formal verification of security. Yet, this would be unnecessary in most
application scenarios, due to the general-purpose operating systems used. There
are no significant differences between the two designs.

Elimination of duplicate functionality requires that no security
functionality should be implemented in multiple modules. This is a significant
disadvantage of the DSC design. Many of the subobjects' security protocols are
likely to be similar and must be implemented by each subobject even though most
central security functions are implemented by the security subobject. In the CSC
design, each protocol is implemented once. This improves the control of the
implementation but increases the complexity of the security subobject's
interface.
Code optimization for new hardware is essential for performance reasons.
It is likely that only cryptographic functions are implemented in hardware.
In both designs, the cryptographic functions can be separated from protocols by
proper software engineering or by logical separation of security protocols from
security functions. Both designs provide reasonably good support for hardware
implementations of different security functions.

Complete mediation requires the security subobject to be consulted in
each method invocation. System design should prevent subobjects from bypassing
security at their discretion. Complete mediation can, through design, be
achieved by the CSC design. The security subobject methods are always called,
even when no security measures are applied (i.e. some security functions are
NULLs). With DSC, control measures and security method integrity checks have
to be applied at multiple locations.

Least privilege refers to the components of the system gaining only a min- 
imum set of accesses to sensitive data required for completing their tasks. The 
DSC design is problematic because of the distribution of security related pro- 
cessing to every subobject of the system. Each subobject must be given access 
to security critical components, such as security associations. The CSC design 
enables easier control of the privileges. 

Ease of future alterations measures adaptation to different security policies.
As security requires continuous maintenance, this is an important criterion.
The CSC design is superior, mostly due to the single point of alteration
required. In the DSC design, each alteration in protocols of security functions
must be implemented in each subobject. However, proper software engineering can
simplify alterations.

The comparison suggests the superiority of the CSC design over the DSC
design. However, the comparison only deals with the design criteria, not with
performance issues. It is not clear how significant a performance reduction is
caused by security processing relative to, for example, network latencies when
the local objects are distributed over wide area networks. Intuitively, it
appears that the performance penalties of the two designs are not significantly
different relative to the overall cost of communication. Real measurements are
required to confirm this intuition.

The CSC design also has some additional advantages over security enforced by
conventional subobjects in, for example, ease of SA maintenance, as discussed
in the following section.



4.4 Security Association 

Each local object must share at least one SA with local objects it is communi- 
cating with. In the CSC design, the security associations are maintained by the 
security subobject. 

An SA does not have any particular functionality. It is a data structure that
stores the security state of communication. A significant advantage of the CSC
design over the DSC design is the ease of applying different SA schemes, for
example:

Single local object SA scheme is where a single SA is maintained by local
objects and used for all security needs.

Multiple local object SAs scheme is where a number of SAs are maintained
by local objects and used for different security needs but shared between all
subobjects.

Single subobject SA scheme is where each subobject of a local object maintains
an SA with peer subobjects and uses it for all its security needs.

Multiple subobject SAs scheme is where each subobject of a local object
maintains a number of SAs with peer subobjects and uses them for different
security needs.
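One way to see why the four schemes are easy to swap inside a single security subobject is to treat them as different ways of indexing the same SA table. The sketch below is a hypothetical illustration: the enum, the function name, and the use of "*" for a shared slot are all assumptions.

```python
from enum import Enum
from typing import Tuple


class Scheme(Enum):
    SINGLE_LOCAL_OBJECT = 1
    MULTIPLE_LOCAL_OBJECT = 2
    SINGLE_SUBOBJECT = 3
    MULTIPLE_SUBOBJECT = 4


def sa_slot(scheme: Scheme, subobject: str, need: str) -> Tuple[str, str]:
    """Return the (subobject, security need) slot holding the SA; '*' means shared."""
    if scheme is Scheme.SINGLE_LOCAL_OBJECT:
        return ("*", "*")            # one SA for everything
    if scheme is Scheme.MULTIPLE_LOCAL_OBJECT:
        return ("*", need)           # one SA per need, shared by all subobjects
    if scheme is Scheme.SINGLE_SUBOBJECT:
        return (subobject, "*")      # one SA per subobject, for all its needs
    return (subobject, need)         # one SA per subobject and need
```

Changing the scheme then only changes the lookup inside the security subobject; the conventional subobjects keep calling the same interface.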

In many point-to-point applications, the single local object SA scheme is the
most likely scenario. Since the subobjects of a local object are maintained in a
single address space, there are no reliable means of preventing malicious local
objects from violating the security of other subobjects. Therefore, multiple SAs
may not be meaningful.

Different keys may be maintained for different services or security levels but
shared between the subobjects of a local object. If local objects operate in
environments that provide separate address spaces or other tamper-proof
execution environments, more complicated and fine-grained keying schemes may be
relevant.

Complicated SA schemes also occur in very large DSOs (Fig. 3). Circles
illustrate local objects that constitute the distributed object. Local objects
are further grouped into three groups.

The core group (G1) is formed of those local objects that are most crucial
to the application, for example sites from which a WWW page can be updated.
The cache group (G2) is a set of passive sites replicating the service, in this
example the pages, without altering the content. The client groups (G3a and
G3b) contain those clients that have at a certain point of time registered with
a member of the cache group to access the service. Disconnected local objects
may connect to the distributed shared object in the future.
Table 3. Security subobject enforceable security policies

Policy                          Purpose
Communication security policy   Static communication security
Security association            Dynamic communication security
Access control policies         Method invocation control
Behavior monitoring policies    Intrusion and misuse detection
Local policies                  For subobject internal security



Assume that core group members deliver online a data item, such as a
newspaper, software component, or a digital media clip, for which a payment is
required. In global distribution, the core group objects cannot deliver the item
to millions of customers. Rather, a number of caching sites are established and
clients access cache sites for the service.

The data item can be protected by encryption and registered clients can 
obtain (maybe once a payment transaction is completed) a cryptographic key 
to recover the item. Key distribution may depend on the level of trust of cache 
group members: 

Untrusted cache is where the core group members do not trust members of the
cache group. Caches hold encrypted data but cannot decrypt it. Clients must
buy the decryption key directly from core group members or a dedicated key
server.

Trusted cache is where the core group members allow cache group members
to have the encryption key of data elements. This means that data can be stored
in plaintext on the caches and link-encrypted when communicating with a
client.
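The difference between the two cases comes down to who holds which key. The sketch below illustrates this with a toy XOR stream cipher built from SHA-256; the cipher is a stand-in chosen only to keep the example self-contained and is not secure for real use. All key names and the sample item are hypothetical.

```python
import hashlib


def toy_stream_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Illustrative only, NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(d ^ s for d, s in zip(data, stream))


item_key = b"item-key-held-by-core-group"
item = b"today's newspaper"

# Untrusted cache: stores ciphertext only; the client buys item_key from the core group.
untrusted_cache = {"item": toy_stream_cipher(item_key, item)}
untrusted_client_view = toy_stream_cipher(item_key, untrusted_cache["item"])

# Trusted cache: may store plaintext and link-encrypts under a key shared with the client.
link_key = b"cache-to-client-link-key"
trusted_cache = {"item": item}
on_the_wire = toy_stream_cipher(link_key, trusted_cache["item"])
trusted_client_view = toy_stream_cipher(link_key, on_the_wire)
```

In the untrusted case the cache never needs an SA that contains the item key, while in the trusted case it needs one SA towards the core group and one per client link.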

The level of trust of replicas depends on the application domain. Through
appropriate implementations, the need for a number of SAs can also be reduced.
Untrusted caches can, for example, share one SA with a core group member and
another SA with a client group member. With tamper-proof hardware, data can
be decrypted and reencrypted without disclosure to the caching site.

As this may be a practical impossibility, the need for flexible SA schemes re- 
mains. The above listed SA schemes can be extended by various group-security 
SA schemes and application specific schemes. The security subobject can in- 
dependently maintain the required SA scheme without violating the security 
model. 



4.5 Security Policy 

The local object design requires a number of security policies as introduced in 
Table 3. System level and managerial security policies are omitted as they are 
beyond the scope of Globe security design. 

The largely static communication security policy that describes the security
measures to be applied in the communication is implicit. It is constituted by
the implementation of the security subobject. Means to achieve higher
policy-mechanism independence are a major area of future research.

The lower-level security policy, describing the ways in which the implemented
security measures are executed, is expressed in a more flexible manner through
dynamically changing security associations. Negotiation mechanisms of security
associations are in fact mechanisms for negotiating the security policy and the
components of an SA used for enforcing the policy.

Access control policies describe which methods can be invoked by which 
clients under which circumstances. Behavior monitoring policies describe the 
ways in which method invocations are monitored for intrusion and misuse de- 
tection, and how deviations are handled. 

Local policies must be enforced by each subobject internally. They are not
concerned with the security of communication but with the internal security of a
subobject. For example, prevention of denial of service attacks may require
local resource allocation policies at each subobject.

5 Interfaces to the Security Subobject 

The DSO security has two distinctive facets: transformational security to enforce 
the communication security, and access control to impose restrictions on method 
invocations. Interfaces of the security subobject shall be discussed from both 
points of view. Secure operations of a DSO shall then be addressed. 

5.1 Transformational Security 

The security subobject must have a standardized interface and a calling sequence
from other subobjects. This enables online adoption of different communication
security policies by replacing the security subobjects of relevant local objects.
No alterations to other subobjects are required.

Methods of the security subobject must be paired so that each method that
enforces a certain security feature is associated with a method that removes or
verifies the added security. For example, a method encrypting data must be
paired with a method for decrypting the data. A method for calculating a MAC
must be paired with a method for verifying the MAC.

Transformational security is encapsulated into two method calls per subobject.
The first method adds the security that the second method removes or verifies.
Security-adding methods are called by the sending local object and the
security-removing and verifying methods by the receiving local object. The
general calling sequence (Fig. 4) is as follows:

1. A client accesses the local object through the control subobject.

2. The control subobject marshalls the arguments to the method call and, prior
to passing the marshalled arguments to the replication subobject, calls the
SecCtrlAdd() method of the security subobject. The method does all the
security related processing relevant to the control subobject and adds the
required fields into the argument string.

3. The control subobject passes the processed argument string to the replication
subobject.

4. The replication subobject processes the marshalled arguments and invokes
the SecReplAdd() method that implements the security relevant processing
and adds the necessary data to the arguments.

5. The data is passed to the communication subobject.

6. The communication subobject processes the data and calls the SecCommAdd()
method that implements the required features of communication security prior
to sending the resulting data through the communication channel.

7. The data is communicated to the other local object.

8. Upon receipt of data, the receiving communication subobject calls the
SecCommRemove() method of the security subobject that reverses and verifies
the security measures put in place by the SecCommAdd() method.

9. The data is processed by the communication subobject and passed to the
replication subobject.

10. The replication subobject calls the SecReplRemove() method that reverses
and verifies the security measures put in place by the SecReplAdd() method.

11. The replication subobject processes the data and passes it to the control
subobject.

12. The control subobject calls the security subobject method SecCtrlRemove()
that reverses and verifies the security measures put in place by the
SecCtrlAdd() method.

13. The control subobject proceeds with the method invocation. The results are
passed to the remote client and the security process is repeated.

Fig. 4. Security subobject calling sequence

Some security methods may be NULL as there may not be security processing 
at each subobject. However, it is imperative that the methods are called in 
the above sequence to enable replacement of the security subobject without 
modifying other subobjects. 
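The calling sequence can be sketched as follows. The six method names are taken from the steps above; everything else is an assumption of this example, including the policy it implements (an HMAC-SHA256 code added at the control and communication levels, and NULL methods at the replication level) and the helper functions `send` and `receive`.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of an HMAC-SHA256 authentication code


class GlobeSecurityException(Exception):
    pass


class SecuritySubobject:
    """Method names follow the text; the bodies are an assumed MAC-only policy."""

    def __init__(self, key: bytes):
        self._key = key

    def _add(self, label: bytes, data: bytes) -> bytes:
        return data + hmac.new(self._key, label + data, hashlib.sha256).digest()

    def _remove(self, label: bytes, data: bytes) -> bytes:
        body, tag = data[:-TAG_LEN], data[-TAG_LEN:]
        expected = hmac.new(self._key, label + body, hashlib.sha256).digest()
        if len(data) < TAG_LEN or not hmac.compare_digest(tag, expected):
            raise GlobeSecurityException(label.decode() + " verification failed")
        return body

    def SecCtrlAdd(self, data: bytes) -> bytes: return self._add(b"ctrl", data)
    def SecCtrlRemove(self, data: bytes) -> bytes: return self._remove(b"ctrl", data)
    def SecReplAdd(self, data: bytes) -> bytes: return data        # NULL method
    def SecReplRemove(self, data: bytes) -> bytes: return data     # NULL method
    def SecCommAdd(self, data: bytes) -> bytes: return self._add(b"comm", data)
    def SecCommRemove(self, data: bytes) -> bytes: return self._remove(b"comm", data)


def send(sec: SecuritySubobject, marshalled: bytes) -> bytes:
    data = sec.SecCtrlAdd(marshalled)    # step 2, control subobject
    data = sec.SecReplAdd(data)          # step 4, replication subobject
    return sec.SecCommAdd(data)          # step 6, communication subobject


def receive(sec: SecuritySubobject, wire: bytes) -> bytes:
    data = sec.SecCommRemove(wire)       # step 8
    data = sec.SecReplRemove(data)       # step 10
    return sec.SecCtrlRemove(data)       # step 12
```

Because `send` and `receive` always call all six methods, a security subobject with a different policy (for example one that also protects the replication level) can be substituted without changing the callers.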
The interface to the security subobject is simple, and the semantics of the
byte strings passed as arguments to the methods are determined by the
conventional subobjects. Security processing does not need to be aware of them.

Each method must receive the arguments by reference and throw a
GlobeSecurityException or return a GlobeSecurityError code, depending on the
implementation language. This enables subobjects to standardize the handling
of security errors.



5.2 Access Control 

Access control is concerned with which methods of a local object can be invoked 
by which clients under which conditions. For example, access to Write() methods 
may be restricted to the members of the core group, and may only be granted 
after strong authentication and if a certain environmental condition, such as 
time of the day, is met. This requires a sophisticated access control scheme to 
mediate method invocation. This is logically placed in the control subobject. 

Access control functionality can be divided into two layers: credential veri- 
fication layer and access enforcement layer. As the access control decisions are 
made at the control subobject, there is a need to store the results of credential 
verification in a data structure available for the access control decision function. 

The security subobject maintains a Credentials data structure that consists
of a number of (attribute, value) pairs, where the values of different attributes
are set by the security subobject once credentials are verified. The access
control decision function reads the credential data structure, the access control
rule base, and a set of environmental variables needed in access decisions. The
access control facility can be invoked by the CtrlVerifyAccess(methodID) method,
prior to step (13) in Fig. 4.

The access control rule base consists of a number of authorization statements 
describing under which conditions certain clients are allowed to invoke certain 
methods of the semantics subobject. 
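A minimal sketch of such a decision function is given below. The method name follows the text, but its extra parameters, the representation of the rule base as lists of predicates, and the concrete rules (core-group membership, strong authentication, and an 08:00 to 18:00 window for Write) are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Credentials = Dict[str, str]   # (attribute, value) pairs set by the security subobject
Environment = Dict[str, int]   # environmental variables, e.g. the hour of the day
Rule = Callable[[Credentials, Environment], bool]

# Hypothetical rule base: one list of authorization conditions per method.
RULE_BASE: Dict[str, List[Rule]] = {
    "Write": [
        lambda cred, env: cred.get("group") == "core",
        lambda cred, env: cred.get("authentication") == "strong",
        lambda cred, env: 8 <= env.get("hour", -1) < 18,
    ],
    "Read": [lambda cred, env: "identity" in cred],
}


def CtrlVerifyAccess(method_id: str, credentials: Credentials, env: Environment) -> bool:
    """Grant access only if every rule for the method holds; unknown methods are denied."""
    rules = RULE_BASE.get(method_id)
    return rules is not None and all(rule(credentials, env) for rule in rules)
```

The function only reads the verified credentials; verification itself stays in the credential verification layer, which keeps the two layers separable.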



5.3 Secure Operations 

Methods similar to those required for security in method invocation are
required for control messages. Like method invocations, control messages
need to be protected in transmission, and the right to invoke certain control
operations may be restricted to certain local objects. Separate security
associations may need to be maintained for control messages.

The previous discussion has focused on the secure remote invocation of methods.
There are also security requirements that are not related to method invocations.
These requirements are mostly concerned with the dynamic aspects of DSOs,
most importantly the binding of new local objects to a distributed shared object.

Each Globe object has a unique ID. The naming service maps a symbolic
name to an object ID. The location server can then be queried for the contact
address of a local object to bind to. Prior to the actual binding, an object class
must be retrieved from the local implementation repository, and the local object
constructed from the object class.

Name servers, location servers, and implementation repositories may not
belong to a single administrative domain. Therefore, clients may not equally
trust all service providers. Measures are required for adequate security at all
the services. Unlike with, say, the security extensions to the Internet Domain
Name System (DNS), naming and location information may not always be public.
Therefore, it is unlikely that existing standards can be directly applied in
Globe.

Research is currently being carried out to investigate the extent to which 
existing infrastructure services can be applied in Globe. 

6 Conclusions 

This paper has analyzed the difficulties in designing security for DSO platforms, 
using Globe as a reference system. In particular, the objective of isolating 
security-relevant processing from other computations constitutes a fundamental 
design challenge. Yet it is essential for enabling a framework in which appropriate 
assurance of the correctness of the security design and implementation can be achieved. 

We have concluded that, despite certain disadvantages, it is better to cen- 
tralize security enforcement into a single security subobject. The method names 
and calling sequences from subobjects can then be standardized to enable re- 
placement of the security subobject without modifying other subobjects. 

The interface of the security subobject consists of three types of methods. 
Transformational security measures are used for protecting method invocations 
and control messages during communication. Access control methods prevent 
unauthorized clients from invoking methods or unauthorized local objects from 
invoking control methods. Other security measures are applied for other security 
operations, such as secure binding. 

The work is currently in progress: certain applications have been developed 
in Globe, and more advanced application scenarios are currently 
under research. Further research is also under way on providing a high level 
of policy-mechanism independence. 

As Globe is currently implemented in Java, there are certain possibilities 
for replacing the implementation on-line, for example during the binding of a 
new client to an existing local object. 

Abstract. This paper introduces a cryptographic paradigm called self-escrowed 
encryption, a concept initiated by kleptography. In simple 
words, a self-escrowed public-key cryptosystem features the property that 
the scheme’s public and private keys are connected to each other by 
means of another cryptosystem, called the master scheme. We apply this 
notion to the design of auto-recoverable auto-certifiable cryptosystems, 
a solution to software key escrow due to Young and Yung, and provide 
a new cryptographic escrow system called self-escrowed public key 
infrastructure. In addition, we give an example of such a system based on the 
ElGamal and Paillier encryption schemes which achieves a high level of 
both efficiency and security. 



1 Introduction 

In recent years, considerable research efforts have been invested by the 
cryptographic community into the quest for an efficient and fair solution to the key 
escrow problem. Although the widespread use of today’s communication 
networks such as the Internet would require the urgent deployment of a large-scale 
key recovery system for law-enforcement purposes, the complexity of the 
problem is such that very few satisfactory proposals have appeared so far. 
Tamper-resistant hardware solutions, such as the Clipper and Capstone chips, arouse the 
users’ suspicion about the (unscrutinized) cryptographic algorithms executed 
inside the device [12,5,9]; many proposed systems require the escrow authorities 
to get involved in interactive computations at an undesirable level; finally, other 
investigated constructions fail to resist various kinds of attacks (cf. 
shadow-public-key non-resistance [8]) from the system users. 

Young and Yung [15,16] recently introduced the concept of auto-recoverable 
auto-certifiable cryptosystems (ARC), a software-based cryptographic protocol 
that fulfills most of the identified desirable requirements. ARCs combine the 
functionalities of a typical public-key infrastructure (see below for definitions) with the 
ability to escrow the private keys of the system users. To achieve this, the 
certification procedure of a given public key demands that the key be submitted along 
with a publicly verifiable zero-knowledge proof that the escrow authorities can 
efficiently recover the corresponding private key. The proof forms a certificate 
of recoverability which has to be stored securely by the certification authority 
(CA). If a key recovery procedure is authorized for some suspect user, the 
escrow authorities query the CA for the matching certificate which allows them 
to completely recover the user’s private key. The same authors also proposed 
a particular embodiment of their concept which relies on ElGamal encryption 
as well as on a specific key generation technique involving an extensive (hence 
costly) use of double decker exponentiations [11]. 

In this paper, we propose a new cryptographic notion which we call self- 
escrowed encryption. We show how to employ this technique to design a crypto- 
graphic protocol that meets all specifications of an auto-recoverable cryptosys- 
tem and presents other additional advantages. In particular, it confers on escrow 
authorities the ability to recover private keys directly from public ones. Conse- 
quently, the storage of some certificate of recoverability is no longer required. 
We call such a system a self-escrowed public key infrastructure (or SE-PKI for 
short). For completeness, we provide a practical example of an SE-PKI which 
is based on the joint use of the ElGamal [7] and Paillier [10] encryption schemes and 
achieves a high level of both efficiency and security. 

The paper is organized as follows. The next two subsections briefly recall the 
definitions of a public-key infrastructure and of an auto-recoverable cryptosystem. 
Section 2 introduces the notion of self-escrowed encryption, which is then used 
to define SE-PKIs in section 3. In section 4, we propose a discrete-log based 
self-escrowed encryption scheme and analyze the corresponding SE-PKI in terms of 
efficiency and security. 



1.1 Public-Key Infrastructures 

A public-key infrastructure (PKI) is a distributed cryptographic protocol 
involving system users and trusted third parties called certification authorities (CA). 
Let $S = (G, E, D)$ denote an encryption scheme where $G(1^k) = (x, y)$ is a 
probabilistic key generator (for a certain scheme parameter $k$), and where $m \mapsto E_y(m)$ 
and $c \mapsto D_x(c)$ represent the encryption and decryption functions, respectively. 
A PKI based on $S$ is a protocol that fulfills the following specifications: 

1. Setup. CA’s addresses and parameters are published and distributed. 

2. Key Generation. Each user runs $G$ to generate a public/private key pair 
   $(x, y)$ and submits $y$ (together with an ID string including personal system 
   attributes) to a CA. 

3. Certification Process. The CA verifies the ID string, signs $y$ and enters 
   the certified key ($y$ + signature) in the public key database. 

4. Encryption. To send a message, a user queries the CA to obtain the public 
   key $y$ of the recipient and verifies the CA’s signature on $y$. If the verification 
   holds, the user encrypts the message $m$ using $y$ and sends the ciphertext 
   $c = E_y(m)$ to the recipient. 

5. Decryption. The recipient decrypts the ciphertext with his/her private key 
   to recover the message $m = D_x(c)$. 

1.2 Auto-Recoverable Auto-Certifiable Cryptosystems 

The notion was introduced in [15]. The system is a classical public-key 
infrastructure to which escrowing mechanisms are added. The protocol ensures 
that some escrow agents, hereafter called escrow authorities, are capable of 
recovering the private key of any user suspected of misbehaving. The cryptosystem 
is denoted $S = (G, E, D, V, R)$ where: 

- $G(1^k) = (x, y, P)$ is a probabilistic key generator that outputs a public/private 
  key pair $(x, y)$ and a publicly verifiable non-interactive zero-knowledge 
  proof $P$ that $x$ is recoverable by the escrow authorities using $P$. 

- $V(y, P) \in \{0, 1\}$ is a publicly known algorithm such that (with overwhelming 
  probability) $V(y, P) = 1$ iff $x$ is recoverable by the escrow authorities using 
  $P$. 

- $R$ takes as inputs $P$ and some private information and returns $x$, provided 
  that $(x, y, P)$ is a possible output of $G$ such that $V(y, P) = 1$. Optionally 
  (distributed key recovery), $R$ can also be an $m$-tuple $(R_1, \ldots, R_m)$ such that 
  each $R_i$, run on $P$ and some private input, returns the share $x_i$ of $x$ w.r.t. 
  some (perfect) secret sharing scheme. Escrow authorities then collaborate to 
  recover $x$. The problem of computing $x$ given $(y, P)$ without $R$ is assumed 
  to be intractable. 

An ARC based on $S$ is a protocol specified by the following: 

1. Setup. The escrow authorities generate a set of public parameters along with 
   the corresponding private algorithm $R$. The public parameters and CA’s 
   parameters are published and distributed. 

2. Key Generation. Each user runs $G$ to generate a public/private key pair 
   $(x, y)$ and a certificate of recoverability $P$. The user then submits the pair 
   $(y, P)$ (together with an ID string including personal system attributes) to 
   a CA. 

3. Certification Process. The CA checks the ID string and verifies that 
   $V(y, P) = 1$. If the verification holds, the CA signs $y$ and enters the certified 
   key ($y$ + signature + certificate $P$) in the public key database. 

4. Encryption. To send a message, a user queries the CA to obtain the public 
   key $y$ of the recipient and verifies the CA’s signature on $y$. If the verification 
   holds, the user encrypts the message $m$ using $y$ and sends the ciphertext 
   $c = E_y(m)$ to the recipient. 

5. Decryption. The recipient decrypts the ciphertext with his/her private key 
   to recover the message $m = D_x(c)$. 

6. Key Recovery. If key recovery is authorized for a given user, the escrow 
   authorities query the CA for the corresponding certificate $P$ and run $R$ on 
   $P$ to recover the user’s private key $x$. 



2 Self-Escrowed Encryption Schemes 

In this section, we rigorously formalize the notion of self-escrowed encryption, 
initiated in spirit by kleptography [13,14]. 

The usual way of formally describing a public-key encryption scheme $S$ 
consists in decomposing it into three distinct algorithms $S = (G, E, D)$ where $G$ is 
a key generator (e.g. a probabilistic algorithm that outputs a typical key pair 
$(x, y)$ in polynomial time), and where $m \mapsto E_y(m)$ and $c \mapsto D_x(c)$ represent 
the encryption and decryption algorithms parameterized by the respective keys. 
To provide one-wayness, the scheme is necessarily built in such a way that the 
public key $y$ is derived from the secret key $x$ by means of some compliant 
one-way function $y = F(x)$ such as integer multiplication or exponentiation in a 
well-chosen group. 

A self-escrowed encryption scheme can be defined as an encryption scheme 
for which the function $F$, in addition to being one-way, also presents partial or 
total trapdoorness. When it does, $F$ can be expressed as some encryption 
function $\mathcal{E}_Y$ for some existing encryption key $Y$. This also means that a 
public/private key pair $(x, y)$ of $S$ such that $x$ falls into the “trapdoorness domain” 
of $F$ satisfies $x = \mathcal{D}_X(y)$, i.e. there exists some trapdoor 
information $X$ which allows one to recover $x$ from $y$. This property can be captured 
precisely as follows. 

Definition 1. An encryption scheme $S = (G, E, D)$ is said to be perfectly 
self-escrowed when there exist an encryption scheme $\mathcal{S} = (\mathcal{G}, \mathcal{E}, \mathcal{D})$ and a key pair 
$(X, Y)$ of $\mathcal{S}$ such that for all key pairs $(x, y)$ of $S$ the relation 

    $y = \mathcal{E}_Y(x)$    (1) 

holds. By analogy with the secret key setting, $\mathcal{S}$ is called the master encryption 
scheme of $S$, $Y$ the master public key and $X$ the master private key. 

As we will see later in the paper, definition 1 may not be met in a strict 
sense even though the given scheme presents some partial access to the self-escrow 
property. In particular, there could exist a master key pair for which most key 
pairs satisfy relation 1. Situations may also occur wherein the set of escrowable 
private keys remains of reasonable size, although being a negligible proportion 
of the whole private key space. Self-escrow properties can still be defined in that 
case by weakening the strong requirements of definition 1, as follows. 

Definition 2. An encryption scheme $S = (G, E, D)$ is said to be (partially) 
self-escrowed when there exist an encryption scheme $\mathcal{S} = (\mathcal{G}, \mathcal{E}, \mathcal{D})$, a key pair $(X, Y)$ 
of $\mathcal{S}$ and a pair of probabilistic polynomial-time algorithms $(\mathcal{P}, \mathcal{V})$ such that for 
all key pairs $(x, y)$ of $S$ satisfying relation 1, $\mathcal{P}(Y, x, y) = P$ is a non-interactive 
(statistical) zero-knowledge proof that relation 1 holds and $P$ is publicly verifiable, 
i.e. $\mathcal{V}(Y, y, P) \in \{0, 1\}$ equals 1 (with overwhelming probability) if and only if $P$ 
is a valid proof for $y$. 

In other words, we require that any key pair fulfilling the desired property 
can efficiently be proven to do so, and that the generated proof of recoverability 
is public and can be verified by anyone: a requirement that imposes 
zero-knowledgeness. 

Once more, it is understood that the purpose of kleptographic attacks is no 
different from attempting to turn a target encryption scheme into a self-escrowed 
cryptosystem. Kleptography is by nature closely related to key recovery 
techniques; they differ in spirit only.¹ In both cases, the computational dependence 
of public keys on other public keys (be it subliminal or publicly known) seems 
to be necessary. 



3 Self-Escrowed Public-Key Infrastructures 

A self-escrowed public-key infrastructure is a particular case of an auto-recove- 
rable auto-certifiable cryptosystem. The major advantage of an SE-PKI resides 
in that the proof of recoverability generated by the user is verified by the CA 
and then immediately discarded, since it is of no use regarding the key recovery 
procedure. Consequently, this releases certification authorities from the data 
storage of certificates of recoverability needed in ARCs, and completely removes 
interaction with escrow authorities during key recovery. This novel property is 
achieved using self-escrowed encryption as follows. 

Let $S = (G, E, D)$ be a self-escrowed encryption scheme and $\mathcal{S} = (\mathcal{G}, \mathcal{E}, \mathcal{D})$ 
denote its master scheme. Recall that by definition, there also exists a pair of 
algorithms $(\mathcal{P}, \mathcal{V})$ that generate and verify proofs of recoverability. An 
SE-PKI based on $S$ can be defined as follows. 

1. Setup. The escrow authorities run $\mathcal{G}$ to generate a master public/private key 
   pair $(Y, X)$. The public parameters, including $Y$ and CA’s parameters, are 
   published and distributed. 

2. Key Generation. Each user runs $G(Y, 1^k)$ to generate a public/private key 
   pair $(x, y)$, and then runs $\mathcal{P}(Y, x, y)$ to get the proof $P$ that $y = \mathcal{E}_Y(x)$ holds. 
   The user then submits the pair $(y, P)$ (together with an ID string including 
   personal system attributes) to a CA. 

3. Certification Process. The CA checks the ID string and runs $\mathcal{V}$ to verify 
   that $\mathcal{V}(Y, y, P) = 1$. If the verification holds, the CA signs $y$ and enters the 
   certified key ($y$ + signature) in the public key database. 

4. Encryption. To send a message, a user queries the CA to obtain the public 
   key $y$ of the recipient and verifies the CA’s signature on $y$. If the verification 
   holds, the user encrypts the message $m$ using $y$ and sends the ciphertext 
   $c = E_y(m)$ to the recipient. 

5. Decryption. The recipient decrypts the ciphertext with his/her private key 
   to recover the message $m = D_x(c)$. 

6. Key Recovery. If key recovery is authorized for a given user, the escrow 
   authorities recover the user’s private key $x = \mathcal{D}_X(y)$ and decipher the 
   transmitted ciphertext(s). Here again, the key recoverability may be distributed 
   among escrow authorities using some threshold decryption scheme. 



¹ The kleptographic adversary wishes the very existence of the ability to recover keys to 
be secret. A subliminally self-escrowed encryption scheme is called a SETUP, see [13]. 

It is worth noticing that an SE-PKI remains extremely close by 
construction to a regular PKI: a given PKI is self-escrowed iff the underlying 
encryption scheme is also self-escrowed; the two properties directly derive from 
each other. 

4 An Efficient Self-Escrowed PKI 

We now proceed to describe a practical example of an SE-PKI. As just pointed 
out above, this requires setting up a self-escrowed cryptosystem first. Our proposed 
scheme is based on the joint use of the ElGamal and Paillier encryption schemes 
and achieves (partial) self-escrow in the sense of definition 2, as will be shown 
later. We begin with a brief overview of useful mathematical facts. 

4.1 Self-Escrowed Discrete Log-Based Cryptosystems 

Paillier Encryption. Recently, Paillier [10] introduced public-key probabilistic 
encryption schemes based on composite residuosity classes over $\mathbb{Z}^*_{n^2}$ where $n$ is 
an RSA modulus $n = pq$. To briefly describe the trapdoor, the knowledge of the 
factors of $n$ allows a fast extraction of discrete logarithms modulo $n^2$, 
provided that the base $g \in \mathbb{Z}^*_{n^2}$ is of order $n\alpha$ for some $\alpha$ with $\gcd(n, \alpha) = 1$. In 
the sequel, $g$ will be chosen of maximal order $n\lambda$ where $\lambda = \lambda(n) = \mathrm{lcm}(p-1, q-1)$ 
must be relatively prime to $n$. We define over $U_n = \{u < n^2 \mid u = 1 \bmod n\}$ 
the integer-valued function $L(u) = (u-1)/n$ where the division takes place in $\mathbb{Z}$. 
The public key is then the pair $(n, g)$ while the private key is $\lambda$ or equivalently 
the factors $p$ and $q$. Encryption of a plaintext $m < n$ is done as follows. Pick an 
integer $r$ uniformly at random in $[0, 2^\ell]$ where $\ell$ denotes the bitlength of $n$, and 
compute the ciphertext 

    $c = g^{m+nr} \bmod n^2$ .    (2) 

To decrypt, compute 

    $\frac{L(c^\lambda \bmod n^2)}{L(g^\lambda \bmod n^2)} \bmod n$ .    (3) 




The one-wayness of the scheme is known to be equivalent to the partial discrete 
logarithm problem with base $g$, which is thought to be intractable provided 
that $n$ is hard to factor. We refer the reader to [10] for more details. In this 
paper, we will be considering a deterministic version of this encryption scheme, 
i.e. encryption of a message $m < n$ is done by a simple exponentiation 
$c = g^m \bmod n^2$ while decryption is carried out as in equation 3. The encryption 
scheme is depicted in figure 1. 

From a theoretical viewpoint, making the cryptosystem deterministic somewhat 
decreases its security level, since computing partial logarithms then reduces 
to computing simple discrete logarithms². However, since we do not know any 
way of extracting discrete logs without the secret factors, we will make the 
assumption that inverting the encryption function still remains an intractable 
problem in this context. 

² The scheme also loses semantic security. 


Public Key     $n$, $g$ of maximal order. 

Private Key    $\lambda = \mathrm{lcm}(p-1, q-1)$. 

Encryption     plaintext $m < n$ 
               ciphertext $c = g^m \bmod n^2$ 

Decryption     ciphertext $c < n^2$ 
               plaintext $m = \frac{L(c^\lambda \bmod n^2)}{L(g^\lambda \bmod n^2)} \bmod n$. 

Fig. 1. Paillier’s Deterministic Encryption Scheme. 

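
The deterministic scheme can be tried out with toy numbers. The following is a minimal sketch, not a secure implementation: the primes are tiny, and the base g is simply the first value for which L(g^λ mod n²) is invertible modulo n, which is all decryption needs. That selection rule is an assumption of this example; the text asks for g of maximal order nλ, which the sketch does not check. The function names are hypothetical. It requires Python 3.8 or later for modular inverses via pow.

```python
from math import gcd


def L(u, n):
    """L(u) = (u - 1) / n as an exact integer division, for u = 1 mod n."""
    return (u - 1) // n


def keygen(p, q):
    """Return ((n, g), lam) for two (toy) primes p and q."""
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    if gcd(lam, n) != 1:
        raise ValueError("lambda must be relatively prime to n")
    n2 = n * n
    for g in range(2, n2):
        # Assumption: accept the first unit g whose L(g^lam) is invertible mod n.
        if gcd(g, n) == 1 and gcd(L(pow(g, lam, n2), n), n) == 1:
            return (n, g), lam
    raise ValueError("no suitable base found")


def encrypt(pub, m):
    """Deterministic encryption c = g^m mod n^2 of a plaintext m < n."""
    n, g = pub
    if not 0 <= m < n:
        raise ValueError("plaintext must satisfy 0 <= m < n")
    return pow(g, m, n * n)


def decrypt(pub, lam, c):
    """m = L(c^lam mod n^2) / L(g^lam mod n^2) mod n, as in equation 3."""
    n, g = pub
    n2 = n * n
    numerator = L(pow(c, lam, n2), n)
    denominator = L(pow(g, lam, n2), n)
    return numerator * pow(denominator, -1, n) % n
```

Decryption works because g^λ = 1 + an (mod n²) for some a, so c^λ = (1 + an)^m = 1 + man (mod n²), and dividing the two L values cancels a.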



Self-Escrowed Diffie-Hellman. The celebrated Diffie-Hellman key exchange 
protocol [3] exploits the (conjectured) hardness of extracting discrete logarithms 
over $\mathbb{Z}^*_p$ (for some well-chosen large prime $p$) and the additive homomorphicity of 
modular exponentiation. The protocol can straightforwardly be executed using 
other kinds of groups over which the discrete log problem is also thought to be 
intractable: we focus here on the specific group $\mathbb{Z}^*_{n^2}$ where $n$ is an RSA modulus 
$n = pq$ just as before. As pointed out above, the knowledge of the factors $p$ and 
$q$ is sufficient to recover the discrete logarithm of $y = g^x \bmod n^2$ provided that 
$x < n$. This leads to a simple self-escrowed Diffie-Hellman variant (cf. figure 2) 
in which some escrow authority, whose private key is $(p, q)$, can easily open the 
session key after wiretapping data exchanged during the protocol. 



Setup 

    The escrow authority generates $n = pq$ and 
    publishes $n$. 

Protocol 

    1. User A picks a random $a < n$ and sends 
       $g^a \bmod n^2$ to user B. 

    2. User B picks a random $b < n$ and sends 
       $g^b \bmod n^2$ to user A. 

    3. Both users compute $K = g^{ab} \bmod n^2$. 

Key Recovery 

    If key recovery is authorized, the escrow 
    authority computes $a$ (or $b$) from wiretapped 
    $g^a$ (or $g^b$) and easily recovers $K$. 



Fig. 2. Self-Escrowed Diffie-Hellman Key Establishment. 

Self-Escrowed ElGamal Encryption. Because ElGamal encryption is a 
non-interactive instance of the Diffie-Hellman protocol, the self-escrow property 
shown above also extends to ElGamal encryption over $\mathbb{Z}^*_{n^2}$. The resulting 
cryptosystem is displayed in figure 3 below. It clearly appears that the encryption 
(ElGamal) possesses a master scheme (Paillier). 



Master Public Key     $n$, $g$ of maximal order, $\ell = 2|n|$. 

Master Private Key    $\lambda$ 

Public Key            $y = g^x \bmod n^2$ where $x <_R n$ 

Private Key           $x < n$ 

Encryption            plaintext $m < n^2$ 
                      ciphertext $c = (m y^k, g^k)$ where $k <_R 2^\ell$ 

Decryption            ciphertext $c = (a, b)$ 
                      plaintext $m = a / b^x \bmod n^2$ 



Fig. 3. Self-Escrowed ElGamal/Paillier Encryption Scheme. 
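
To make the relation between user keys and the master scheme concrete, the sketch below runs the scheme of figure 3 on toy parameters and lets the holder of λ recover a user's private key directly from the public key. It is an illustration under assumptions, not a secure implementation: the primes are tiny, g is the first base for which Paillier decryption is defined (maximal order is not checked), and all function names are hypothetical.

```python
import secrets
from math import gcd


def L(u, n):
    """L(u) = (u - 1) / n for u = 1 mod n."""
    return (u - 1) // n


def master_keygen(p, q):
    """Master public key (n, g, ell) and master private key lam."""
    n = p * q
    n2 = n * n
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
    g = next(c for c in range(2, n2)
             if gcd(c, n) == 1 and gcd(L(pow(c, lam, n2), n), n) == 1)
    return (n, g, 2 * n.bit_length()), lam


def user_keygen(master_pub):
    """Private key x < n and public key y = g^x mod n^2."""
    n, g, _ = master_pub
    x = secrets.randbelow(n)
    return x, pow(g, x, n * n)


def encrypt(master_pub, y, m):
    """ElGamal ciphertext (m * y^k, g^k) mod n^2 with k < 2^ell."""
    n, g, ell = master_pub
    n2 = n * n
    k = secrets.randbelow(1 << ell)
    return m * pow(y, k, n2) % n2, pow(g, k, n2)


def decrypt(master_pub, x, c):
    """m = a / b^x mod n^2."""
    n, _, _ = master_pub
    n2 = n * n
    a, b = c
    return a * pow(pow(b, x, n2), -1, n2) % n2


def recover_private_key(master_pub, lam, y):
    """Escrow side: x = D_X(y), the Paillier decryption of the public key."""
    n, g, _ = master_pub
    n2 = n * n
    return L(pow(y, lam, n2), n) * pow(L(pow(g, lam, n2), n), -1, n) % n
```

The last function is the whole key recovery step: no certificate of recoverability is consulted, only the public key y and the master private key.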



We claim: 

Theorem 1. The encryption scheme of figure 3 is self-escrowed. 



Proof (Construction of $\mathcal{P}$ and $\mathcal{V}$). To comply with definition 2, we still 
have to exhibit algorithms $\mathcal{P}$ and $\mathcal{V}$ with the desired properties. In our context, 
we clearly have $Y = (n, g)$ and $\mathcal{E}_Y(x) = g^x \bmod n^2$. Therefore the proof $P$ to 
be generated must actually be a proof that $x < n$ given $y = g^x \bmod n^2$. To 
achieve this, we now introduce a specific proof technique. We refer the reader 
to Camenisch and Michels’ recent work [1] for an exhaustive overview of interval 
proofs for discrete logarithms. 

We first consider the situation in which a prover interactively proves to a 
verifier that some element $y \in \langle g \rangle$ is such that $y = g^x \bmod n^2$ for some $x < n$ of 
his knowledge. Suppose the two parties first engage in a set-up phase as follows. 
The verifier randomly chooses a large prime $p_0$ such that $n$ divides $p_0 - 1$, 
generates a primitive root $\gamma$ of $\mathbb{Z}^*_{p_0}$ and sets 

    $h_1 = \gamma^{(p_0-1)/n} \bmod p_0$ 
    $h_2 = h_1^{\alpha} \bmod p_0$ 

for some³ $\alpha <_R n$. The parameters $p_0$, $h_1$ and $h_2$ are sent to the prover who 
checks that $p_0$ is prime and that $h_1$ and $h_2$ have order $n$ modulo $p_0$. This 
concludes the set-up phase. Now, the prover chooses a random $z <_R n$, computes 
$\tilde{y} = h_1^z h_2^x \bmod p_0$ and sends $\tilde{y}$ to the verifier. Finally, the two parties engage in 
the protocol 

    $\pi = PK\{(x, z) : y = g^x \bmod n^2 \;\wedge\; \tilde{y} = h_1^z h_2^x \bmod p_0\}$    (4) 

which assures the verifier that the wanted property $x < n$ is fulfilled. Note that 
protocol $\pi$ must be carried out by the parallel execution of the two protocols 

    $\pi_1 = PK\{(x) : y = g^x \bmod n^2\}$ and 
    $\pi_2 = PK\{(x, z) : \tilde{y} = h_1^z h_2^x \bmod p_0\}$, 

together with the same challenge. By virtue of a result due to Fujisaki and 
Okamoto [6], $\pi_1$ will work only if the strong-RSA assumption holds over $\mathbb{Z}^*_{n^2}$: 
we will therefore make this hypothesis in what follows. We refer the reader to 
figure 4 for an insight into the complete protocol. 

³ $a <_R b$ denotes that $a$ is chosen in the interval $[0, b - 1]$ with uniform distribution. 



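Set-up 

    Verifier:            picks $p_0$ prime such that $n$ divides $p_0 - 1$, 
                         and $h_1, h_2 \in \mathbb{Z}^*_{p_0}$ of order $n$ 
    Verifier to Prover:  $(p_0, h_1, h_2)$ 
    Prover:              checks $p_0$ prime? 
                         $h_1^n = 1 \bmod p_0$ ? 
                         $h_2^n = 1 \bmod p_0$ ? 

Commitment 

    Prover:              $r_0 <_R 2^\ell$ 
                         $z, r_1, r_2 <_R n$ 
                         $t_0 = g^{r_0} \bmod n^2$ 
                         $t_1 = h_1^{r_1} h_2^{r_2} \bmod p_0$ 
                         $\tilde{y} = h_1^z h_2^x \bmod p_0$ 
    Prover to Verifier:  $(t_0, t_1, \tilde{y})$ 

Challenge 

    Verifier to Prover:  $e$ 

Response 

    Prover:              $s_0 = r_0 + ex$ 
                         $s_1 = r_1 + ez \bmod n$ 
                         $s_2 = r_2 + ex \bmod n$ 
    Prover to Verifier:  $(s_0, s_1, s_2)$ 

Verification 

    Verifier:            checks $t_0 = g^{s_0} / y^e \bmod n^2$ 
                         $t_1 = h_1^{s_1} h_2^{s_2} / \tilde{y}^e \bmod p_0$ 



Fig. 4. An Interactive SZK Proof that $y = g^x \bmod n^2$ with $x < n$. 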



Since a part of the prover’s response ($s_0$) is computed in $\mathbb{Z}$, we have to make 
sure that $|ex|$ is far smaller than $|r_0|$, i.e. $|e|$ should be small compared to $|n|$, so that 
no information on $x$ can leak out of $s_0$. The proof is then complete. We now see 
that there is no need for the set-up parameters $\{p_0, h_1, h_2\}$ to be random: they 
can be advantageously replaced by absolute constants (depending on $n$ only), 
i.e. they can be chosen once and for all according to the above requirements and 
then included as a part of the public key $Y$. It is however necessary to ensure 
at this level that the prover chooses the random value $z$ on his own so that the 
verifier cannot gain any information whatsoever about $x$ from the knowledge of 
$\tilde{y} = h_1^z h_2^x \bmod p_0$. 

We now turn the protocol into a non-interactive zero-knowledge proof by 
applying the so-called Fiat-Shamir heuristic [4]. By doing so, we determine the 
challenge $e$ by applying some collision-resistant hash function $H$ to the 
commitment. In our case, this also determines the two algorithms $\mathcal{P}$ and $\mathcal{V}$ that we 
have been looking for. These are depicted in figure 5. 



Parameters 

    $n$, $\ell = 2|n|$, $g \in \mathbb{Z}^*_{n^2}$ of maximal order, 
    $p_0$ a prime such that $n$ divides $p_0 - 1$, 
    $h_1$ and $h_2$ of order $n$ in $\mathbb{Z}^*_{p_0}$. 

Generation $\mathcal{P}$ 

    Input     $x < n$ and $y = g^x \bmod n^2$. 

    Output    1. pick $z <_R n$ and compute $\tilde{y} = h_1^z h_2^x \bmod p_0$ 
              2. pick $r_0 <_R 2^\ell$ and $r_1, r_2 <_R n$ 
              3. compute $e = H(g^{r_0} \bmod n^2,\; h_1^{r_1} h_2^{r_2} \bmod p_0)$ 
              4. compute $s_0 = r_0 + ex$ 
                         $s_1 = r_1 + ez \bmod n$ 
                         $s_2 = r_2 + ex \bmod n$ 
              the proof is $P = (\tilde{y}, e, s_0, s_1, s_2)$. 

Verification $\mathcal{V}$ 

    Input     $y < n^2$ and $P = (\tilde{y}, e, s_0, s_1, s_2)$. 

    Output    check whether $e = H(g^{s_0} / y^e \bmod n^2,\; h_1^{s_1} h_2^{s_2} / \tilde{y}^e \bmod p_0)$ 
              0 or 1 according to the verification result. 



Fig. 5. Generation and Verification of a Proof of Recoverability. 



Note that previous works on range-bounded commitments such as [2] could 
also lead to a suitable pair of algorithms for $\mathcal{P}$ and $\mathcal{V}$. □ 

Combining our self-escrowed encryption scheme with the specifications of sec- 
tion 3 now gives us a self-escrowed public key infrastructure. We sum up hereafter 
the main features and properties that characterize our cryptosystem. 

4.2 Security Aspects 

One-Wayness. The problem of inverting our self-escrowed encryption function 
is equivalent to the Diffie-Hellman problem with base $g$ over $\mathbb{Z}^*_{n^2}$. 

Semantic Security IND-CPA. The semantic security of the scheme is 
equivalent to the Decision Diffie-Hellman problem with base $g$ over $\mathbb{Z}^*_{n^2}$. 

Extraction of the Private Key. The problem of computing the private key $x$ 
from the public key $y = g^x \bmod n^2$, i.e. inverting the master encryption scheme, 
is equivalent to the discrete log problem with base $g$ over $\mathbb{Z}^*_{n^2}$. 

Extraction of the Master Private Key. Since the master private key is $\lambda$ 
or equivalently the prime factors of $n$, computing the master key exactly means 
factoring $n$. 

4.3 Efficiency 

Each step of our SE-PKI involves a few modular exponentiations at most for 
each party. We will provide a further analysis of computational workloads in the 
final version of the paper. 

4.4 Other Cryptographic Features 

Soundness. Key recovery can be successfully performed over any certified user 
key with overwhelming probability: this is ensured by the definition of algorithms 
$\mathcal{P}$ and $\mathcal{V}$. 

Traceability. We conclude with a technical remark illustrating that our 
cryptosystem allows user traceability under certain circumstances: suppose that 
some known user, say user A, is suspected of regularly encrypting illegal 
documents that he or she sends through a global computer system to an unknown 
recipient, user B. Further assume that the priority of the police forces is to identify 
user B. Of course, tracing the transmitted data packets themselves to locate 
the recipient is assumed to be infeasible (because of a mix network, for instance). The police could ask 
escrow authorities to recover user A’s private key but this would be of no help. 
The contents of the transmitted information are known (this is a typical scenario 
of theft of sensitive information). What could be done? 

Suppose you catch a single ciphertext $c = (m y^k, g^k)$ (by wiretapping the 
sender). Since $m$ is known (or guessable), you easily compute $y^k$, query an 
escrow authority for disclosing the exponents and get the two values $xk \bmod n$ 
and $k \bmod n$. You then infer $x = xk/k \bmod n$, which leads to the recipient’s 
public key $y = g^x \bmod n^2$. Finally, querying the public-key database will give 
user B’s ID. 
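
The tracing computation can be written out step by step. The derivation below is a sketch using the notation of figure 3 and of definition 1 (with $\mathcal{D}_X$ denoting master decryption); the invertibility conditions on $m$ and $k$ in the last line are assumptions made explicit here, since the division by $m$ and by $k$ is only defined when they hold.

```latex
\begin{align*}
c &= (a, b) = \bigl(m\,y^{k} \bmod n^2,\; g^{k} \bmod n^2\bigr), \\
a\,m^{-1} \bmod n^2 &= y^{k} \bmod n^2 = g^{xk} \bmod n^2, \\
\mathcal{D}_X\bigl(y^{k}\bigr) &= xk \bmod n, \qquad
\mathcal{D}_X\bigl(g^{k}\bigr) = k \bmod n, \\
x &= (xk \bmod n)\,(k \bmod n)^{-1} \bmod n,
\qquad \text{assuming } \gcd(m, n) = \gcd(k, n) = 1 .
\end{align*}
```

Since $x < n$, the value obtained modulo $n$ is the recipient's private key itself, from which $y = g^x \bmod n^2$ follows.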

4.5 Open Research 

A typical research topic would be to control private key generation in such a way 
that the overall cryptosystem achieves shadow-public-key resistance; we believe 
this property can be achieved at moderate cost. Also, it would be of interest to 
investigate how our proposal fits the notion of escrow hierarchy introduced by 
Young and Yung in [16]. 

Abstract. In this paper, we propose a Domain-verifiable signcryption 
scheme, applicable to the Electronic Funds Transfer (EFT) protocol, 
in which only n predetermined participants within the domain of 
protocol participants can decrypt their own part of the message and verify the whole 
transaction. The computational cost of our scheme is as low as that 
of Zheng’s scheme under the assumption that a Trusted Third Party (TTP) must be 
used to keep partial information confidential for participants and to support 
multi-verification. Our scheme does not require a TTP. 



1 Introduction 

The Electronic Funds Transfer (EFT) protocol is the most widely used protocol for 
transferring money between financial institutions. The protocol requires both 
confidentiality and authentication services simultaneously. Efficiency is also a 
requirement that financial systems must meet; it can be achieved by applying a 
signcryption scheme to the EFT protocol. Signcryption [9], first proposed 
by Zheng, is a new cryptographic primitive described as a “catch two birds with a single 
stone” scheme. It simultaneously fulfills the functions of both signature and 
encryption in a single logical step, at a computational cost 
significantly lower than that required by the traditional signature-then-encryption 
paradigm [3,6,9]. 

When applied to the EFT protocol in a multi-participant environment, the 
signcryption scheme needs to be modified so that only n predetermined 
participants within a domain can decrypt their own part of the message and verify the whole 
transaction. We call this modified scheme a Domain-verifiable 
signcryption scheme, where domain means the set of participants involved in a 
transaction protocol. In Zheng’s signcryption scheme, unsigncryption (decryption 
and signature verification) needs the recipient’s private key; therefore, only the 
recipient can verify the signature. Zheng’s signcryption schemes are thus of limited 
use in applications where a signature needs to be validated 
by others. To overcome this problem, Bao and Deng [1] modified Zheng’s 
signcryption scheme so that verification of a signature no longer needs the 
recipient’s private key. However, Bao and Deng’s scheme is not as 
computationally efficient as Zheng’s scheme. Moreover, in their scheme the message must be 
decrypted before it can be verified by other parties, which ends up losing confidentiality. To 
maintain confidentiality and to support firewall applications, Gamage, 
Leiwo and Zheng [4] proposed signcryption for third-party verification. 
In this scheme, however, although any verifier can verify the signature, only one person 
can obtain the whole plaintext message. 

In EFT protocol usage, there are many participants in one transaction. 
A transaction consists of secret information to be processed by each 
participant. Each participant requires confidentiality for his own secret information. 
All participants also need authentication of the whole transaction. The signcryption 
schemes proposed previously [1,4,9] cannot be directly used in this situation. 

In this paper, we propose a Domain-verifiable signcryption scheme, based on 
Gamage, Leiwo and Zheng’s signcryption, that can easily be applied to the 
EFT and Secure Electronic Transaction (SET) protocols [7], so that many participants 
within a domain can keep their own part of the message confidential and verify 
the whole transaction. We also sketch an EFT protocol between two banks using 
Domain-verifiable signcryption. The computational cost of our scheme is 
as low as that of Zheng’s scheme under the assumption that a Trusted Third Party (TTP) must 
be used to keep partial information confidential for participants and to support 
multi-verification. With Domain-verifiable signcryption, we can construct an EFT 
protocol without the interaction of a TTP. 

The rest of the paper is organized as follows. The signcryption schemes
proposed so far are described briefly in Section 2. The proposed scheme for
domain verification is discussed in Section 3. Section 4 applies our scheme to
a financial EFT protocol. Finally, concluding remarks are given in Section 5.



2 Related Work 

We describe the three signcryption schemes proposed so far. The original
signcryption primitive, proposed by Zheng in [9], combines the two-step
sign-then-encrypt process for creating a secure authenticated message into a
single logical step, with significant savings in both computational and
transmission costs. A disadvantage for applications such as the EFT protocol,
in which more than two participants are involved, is that only the intended
recipient can verify the message. A modified signcryption scheme was proposed
by Bao and Deng in [1] to overcome this limitation. It increases the
computational cost while preserving the transmission cost savings achieved by
the original scheme. The two disadvantages of this modified signcryption scheme
are:

- The signature verification-only mode of operation can be used only after the 
original recipient has recovered the plaintext message. 

- The plaintext message must be forwarded to a third party for signature 
verification and the message confidentiality can be lost. 

In [4], Gamage, Leiwo and Zheng modified Bao and Deng’s scheme so that
signature verification can be carried out without access to the plaintext,
which preserves the confidentiality of the original message without altering
the sign-then-encrypt paradigm. In this scheme, however, although any verifier
can verify the signature, only the one person who unsigncrypts the signcrypted
message can obtain the whole plaintext message.

Therefore, none of these schemes can be applied directly to an EFT protocol, in
which a transaction consists of partial information for each participant, and
each participant requires confidentiality of his own information even against
the other protocol participants.

2.1 Zheng’s Scheme 

Task: Alice has a message to send to Bob. Alice signcrypts it so that the effect 
is similar to signature-then-encryption. 

Public Parameters:

$p$: a large prime
$q$: a large prime factor of $p - 1$
$g$: an element of $Z_p^*$ of order $q$
hash: a one-way hash function
KH: a keyed one-way hash function
$(E, D)$: the encryption and decryption algorithms of a symmetric key cipher

Alice’s Keys:

$x_a \in Z_q^*$: Alice’s private key, $y_a = g^{x_a} \bmod p$: Alice’s public key

Bob’s Keys:

$x_b \in Z_q^*$: Bob’s private key, $y_b = g^{x_b} \bmod p$: Bob’s public key

Signcrypting: Alice randomly chooses $x \in Z_q^*$, then sets
$(k_1, k_2) = \mathrm{hash}(y_b^{x} \bmod p)$
$c = E_{k_1}(m)$
$r = \mathrm{KH}_{k_2}(m)$
$s = x/(r + x_a) \bmod q$.

Alice sends $(c, r, s)$ to Bob.

Unsigncrypting: Bob computes
$(k_1, k_2) = \mathrm{hash}((y_a g^{r})^{s x_b} \bmod p)$,

$m = D_{k_1}(c)$ to recover the plaintext message, and then checks whether
$\mathrm{KH}_{k_2}(m) = r$ for signature verification. In the unsigncrypting
process, it is straightforward to see that $x_b$ is involved in signature
verification.
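
The steps of Zheng’s scheme can be sketched in Python as follows. This is only
an illustrative sketch under loudly stated assumptions: the group parameters
(p = 2039, q = 1019, g = 4) are toy values far too small for real use, SHA-256
stands in for hash, HMAC-SHA-256 reduced modulo q stands in for KH, and a
hash-based XOR keystream stands in for the cipher (E, D). The names `h`, `kh`,
`xor_stream`, `keygen`, `signcrypt` and `unsigncrypt` are hypothetical helpers,
not taken from [9].

```python
import hashlib
import hmac
import secrets

# Toy parameters (assumption, illustration only): Q is a prime factor of
# P - 1 and G = 4 has order Q in Z_P^*.
P, Q, G = 2039, 1019, 4


def h(value: int) -> bytes:
    """Stand-in for `hash`: SHA-256 over the decimal encoding of value."""
    return hashlib.sha256(str(value).encode()).digest()


def kh(key: bytes, data: bytes) -> int:
    """Stand-in for KH: HMAC-SHA-256, reduced modulo Q."""
    mac = hmac.new(key, data, hashlib.sha256).digest()
    return int.from_bytes(mac, "big") % Q


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy cipher: E and D are the same XOR with a SHA-256 keystream."""
    stream, block = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + block.to_bytes(4, "big")).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def keygen():
    """Return (private key x in [1, Q-1], public key g^x mod p)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def signcrypt(m: bytes, x_a: int, y_b: int):
    while True:
        x = secrets.randbelow(Q - 1) + 1
        digest = h(pow(y_b, x, P))            # (k1, k2) = hash(y_b^x mod p)
        k1, k2 = digest[:16], digest[16:]
        r = kh(k2, m)                         # r = KH_{k2}(m)
        if (r + x_a) % Q != 0:                # s needs r + x_a invertible
            break
    c = xor_stream(k1, m)                     # c = E_{k1}(m)
    s = x * pow(r + x_a, -1, Q) % Q           # s = x / (r + x_a) mod q
    return c, r, s


def unsigncrypt(c: bytes, r: int, s: int, y_a: int, x_b: int):
    # (k1, k2) = hash((y_a * g^r)^(s * x_b) mod p); note the use of x_b.
    digest = h(pow(y_a * pow(G, r, P) % P, s * x_b, P))
    k1, k2 = digest[:16], digest[16:]
    m = xor_stream(k1, c)
    return m if kh(k2, m) == r else None
```

The retry loop in `signcrypt` is a design choice of this sketch: with such a
small q, the case r + x_a = 0 (mod q), in which s is undefined, is not
negligible.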

2.2 Bao and Deng’s Scheme 

Signcrypting: Alice randomly chooses $x \in Z_q^*$, then sets
$k_1 = \mathrm{hash}(y_b^{x} \bmod p)$
$k = \mathrm{hash}(g^{x} \bmod p)$
$c = E_{k_1}(m)$
$r = \mathrm{KH}_{k}(m)$
$s = x/(r + x_a) \bmod q$.

Alice sends $(c, r, s)$ to Bob.

Unsigncrypting: Bob computes
$t_1 = (y_a g^{r})^{s} \bmod p$
$t_2 = t_1^{x_b} \bmod p$
$k_1 = \mathrm{hash}(t_2)$
$k = \mathrm{hash}(t_1)$,

$m = D_{k_1}(c)$ to obtain the plaintext message, then checks whether
$\mathrm{KH}_{k}(m) = r$ for signature verification.

Later, when necessary, Bob may forward $(m, r, s)$ to others, who can be
convinced that it came originally from Alice by verifying
$k = \mathrm{hash}((y_a g^{r})^{s} \bmod p)$ and $r = \mathrm{KH}_{k}(m)$.

In this signature verification, verifiers need to obtain the plaintext message.
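
A minimal Python sketch of Bao and Deng’s variant follows, under the same
loudly stated assumptions as any toy demonstration: the parameters (p = 2039,
q = 1019, g = 4) are illustrative only, SHA-256, HMAC-SHA-256 (reduced modulo
q) and an XOR keystream are stand-ins for hash, KH and (E, D), and all function
names are hypothetical. It shows that a third party can verify with the
public key y_a alone, but only when given the plaintext m.

```python
import hashlib
import hmac
import secrets

P, Q, G = 2039, 1019, 4   # toy group (assumption): G has prime order Q mod P


def h(value: int) -> bytes:
    return hashlib.sha256(str(value).encode()).digest()


def kh(key: bytes, data: bytes) -> int:
    mac = hmac.new(key, data, hashlib.sha256).digest()
    return int.from_bytes(mac, "big") % Q


def xor_stream(key: bytes, data: bytes) -> bytes:
    stream, block = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + block.to_bytes(4, "big")).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def keygen():
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def signcrypt(m: bytes, x_a: int, y_b: int):
    while True:
        x = secrets.randbelow(Q - 1) + 1
        k1 = h(pow(y_b, x, P))                # encryption key, needs x_b
        k = h(pow(G, x, P))                   # signature key, public-only
        r = kh(k, m)
        if (r + x_a) % Q != 0:
            break
    c = xor_stream(k1, m)
    s = x * pow(r + x_a, -1, Q) % Q
    return c, r, s


def unsigncrypt(c: bytes, r: int, s: int, y_a: int, x_b: int):
    t1 = pow(y_a * pow(G, r, P) % P, s, P)    # equals g^x mod p
    t2 = pow(t1, x_b, P)                      # equals y_b^x mod p
    k1, k = h(t2), h(t1)
    m = xor_stream(k1, c)
    return m if kh(k, m) == r else None


def verify_forwarded(m: bytes, r: int, s: int, y_a: int) -> bool:
    """Third-party check on (m, r, s): no private key, but m is exposed."""
    k = h(pow(y_a * pow(G, r, P) % P, s, P))
    return kh(k, m) == r
```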

2.3 Gamage, Leiwo and Zheng’s Signcryption for Third-Party 
Verification 

Signcrypting: Alice randomly chooses $x \in Z_q^*$, then sets
$k = \mathrm{hash}(y_b^{x} \bmod p)$
$y = g^{x} \bmod p$
$c = E_{k}(m)$
$r = \mathrm{hash}(y, c)$
$s = x/(r + x_a) \bmod q$.

Alice sends $(c, r, s)$ to Bob.

Unsigncrypting: Bob computes from $(c, r, s)$
$y = (y_a g^{r})^{s} \bmod p$
$k = \mathrm{hash}(y^{x_b} \bmod p)$,

$m = D_{k}(c)$ to obtain the plaintext message.

Bob accepts the signature if and only if $\mathrm{hash}(y, c) = r$.

For partial unsigncryption with signature verification only, any verifier
computes $y = (y_a g^{r})^{s} \bmod p$ from $(c, r, s)$.

Any verifier accepts the signature if and only if $\mathrm{hash}(y, c) = r$.

This signature verification does not require access to the plaintext message.
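
The following Python sketch illustrates the two modes of this scheme: full
unsigncryption by Bob and verification-only by a third party who never sees
the plaintext. It is a sketch under assumptions, not the construction of [4]
as published: the toy parameters (p = 2039, q = 1019, g = 4), the SHA-256
stand-in for hash, the XOR keystream stand-in for (E, D), and every function
name are hypothetical choices made for illustration.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4   # toy group (assumption): G has prime order Q mod P


def h(value: int) -> bytes:
    return hashlib.sha256(str(value).encode()).digest()


def hash_pair(y: int, c: bytes) -> int:
    """Stand-in for hash(y, c), reduced modulo Q so it can be used as r."""
    digest = hashlib.sha256(str(y).encode() + b"|" + c).digest()
    return int.from_bytes(digest, "big") % Q


def xor_stream(key: bytes, data: bytes) -> bytes:
    stream, block = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + block.to_bytes(4, "big")).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def keygen():
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def signcrypt(m: bytes, x_a: int, y_b: int):
    while True:
        x = secrets.randbelow(Q - 1) + 1
        k = h(pow(y_b, x, P))
        y = pow(G, x, P)
        c = xor_stream(k, m)
        r = hash_pair(y, c)                   # signature binds y and c only
        if (r + x_a) % Q != 0:
            break
    s = x * pow(r + x_a, -1, Q) % Q
    return c, r, s


def verify_only(c: bytes, r: int, s: int, y_a: int) -> bool:
    """Any verifier: recompute y from public values and compare the hash."""
    y = pow(y_a * pow(G, r, P) % P, s, P)
    return hash_pair(y, c) == r


def unsigncrypt(c: bytes, r: int, s: int, y_a: int, x_b: int):
    y = pow(y_a * pow(G, r, P) % P, s, P)
    if hash_pair(y, c) != r:
        return None
    k = h(pow(y, x_b, P))                     # only Bob can derive k
    return xor_stream(k, c)
```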

3 Domain-Verifiable Signcryption Scheme

Within a domain of protocol participants, each participant wants his own
message, which is included in the transaction, to be kept secret even from the
other participants. In addition, all participants need to authenticate the
transaction, which consists of the participants’ secret partial information. We
construct a Domain-verifiable signcryption scheme that satisfies these
requirements: each participant can decrypt just his own message, and all
participants can verify the whole transaction. This scheme can be applied to
the EFT protocol as well as to other protocols, such as SET, in which each
participant’s partial information must be kept secret and the whole message
must be authenticated by all participants simultaneously.

3.1 Scheme for Domain Verification

For consistency, we use the same notation as in Zheng’s scheme, except for the
recipients’ keys.

Recipient $B_i$’s Keys within a Domain of n Participants ($i \in \{1, \ldots, n\}$):

$x_{b_i} \in Z_q^*$: $B_i$’s private key
$y_{b_i} = g^{x_{b_i}} \bmod p$: $B_i$’s public key

Signcrypting: Alice randomly chooses $x \in Z_q^*$, then sets
$k_1 = \mathrm{hash}(y_{b_1}^{x} \bmod p)$, $k_2 = \mathrm{hash}(y_{b_2}^{x} \bmod p)$, ..., $k_n = \mathrm{hash}(y_{b_n}^{x} \bmod p)$
$k = \mathrm{hash}(g^{x} \bmod p)$
$c_1 = E_{k_1}(m_1)$, $c_2 = E_{k_2}(m_2)$, ..., $c_n = E_{k_n}(m_n)$
$r_1 = \mathrm{KH}_{k}(m_1 \| c_2 \| \cdots \| c_n)$, $r_2 = \mathrm{KH}_{k}(c_1 \| m_2 \| \cdots \| c_n)$, ...,
$r_n = \mathrm{KH}_{k}(c_1 \| c_2 \| \cdots \| m_n)$
$s = x/(r_1 r_2 \cdots r_n + x_a) \bmod q$.

Alice sends $(c_1, c_2, \ldots, c_n, r_1, r_2, \ldots, r_n, s)$ to $B_i$.

Unsigncrypting: Recipient $B_i$ computes
$t = (y_a g^{r_1 r_2 \cdots r_n})^{s} \bmod p$
$t_i = t^{x_{b_i}} \bmod p$
$k = \mathrm{hash}(t)$
$k_i = \mathrm{hash}(t_i)$,

$m_i = D_{k_i}(c_i)$ to obtain $B_i$’s own plaintext message, then checks whether
$\mathrm{KH}_{k}(c_1 \| \cdots \| m_i \| \cdots \| c_n) = r_i$ for signature verification.

Later, when necessary, $B_i$ may forward $(c_1, c_2, \ldots, c_n, r_1, r_2, \ldots, r_n, s)$
to any other participant, who can decrypt his own message and be convinced that
it came originally from Alice by executing the same unsigncrypting procedure.
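
The signcrypting and unsigncrypting steps above can be sketched in Python for
an arbitrary number of recipients. As a hedged illustration only: the toy
parameters (p = 2039, q = 1019, g = 4), the SHA-256 / HMAC-SHA-256 / XOR
keystream stand-ins for hash, KH and (E, D), the length-prefixed encoding used
for the concatenation, and all function names are assumptions of this sketch,
not part of the scheme’s specification.

```python
import hashlib
import hmac
import math
import secrets

P, Q, G = 2039, 1019, 4   # toy group (assumption): G has prime order Q mod P


def h(value: int) -> bytes:
    return hashlib.sha256(str(value).encode()).digest()


def kh(key: bytes, data: bytes) -> int:
    mac = hmac.new(key, data, hashlib.sha256).digest()
    return int.from_bytes(mac, "big") % Q


def xor_stream(key: bytes, data: bytes) -> bytes:
    stream, block = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + block.to_bytes(4, "big")).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def keygen():
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def mix(cts, i, m_i):
    """Encode c_1 || ... || m_i || ... || c_n with 4-byte length prefixes."""
    parts = list(cts)
    parts[i] = m_i
    return b"".join(len(part).to_bytes(4, "big") + part for part in parts)


def signcrypt(messages, x_a, pubs):
    while True:
        x = secrets.randbelow(Q - 1) + 1
        keys = [h(pow(y, x, P)) for y in pubs]          # k_i, one per B_i
        k = h(pow(G, x, P))                             # shared signature key
        cts = [xor_stream(k_i, m) for k_i, m in zip(keys, messages)]
        rs = [kh(k, mix(cts, i, m)) for i, m in enumerate(messages)]
        r_prod = math.prod(rs) % Q
        if (r_prod + x_a) % Q != 0:
            s = x * pow(r_prod + x_a, -1, Q) % Q
            return cts, rs, s


def unsigncrypt(i, cts, rs, s, y_a, x_bi):
    """Recipient B_i recovers only m_i and checks only r_i."""
    r_prod = math.prod(rs) % Q
    t = pow(y_a * pow(G, r_prod, P) % P, s, P)          # g^x mod p
    t_i = pow(t, x_bi, P)                               # y_{b_i}^x mod p
    k, k_i = h(t), h(t_i)
    m_i = xor_stream(k_i, cts[i])
    return m_i if kh(k, mix(cts, i, m_i)) == rs[i] else None
```

Each recipient learns only his own m_i, yet the value r_i he checks covers
every other participant’s ciphertext, which is what ties the individual parts
to the whole transaction.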

3.2 Performance and Security 

Consider a situation that calls for Domain-verifiable signcryption. With
Zheng’s scheme, a TTP must be involved to divide the message into partial
messages, one for each participant, and to signcrypt each partial message for
the corresponding participant [10]. Our Domain-verifiable signcryption scheme
does not need a TTP. Counting only modular exponentiations as the computational
cost, with n participants, Domain-verifiable signcryption requires n + 1
modular exponentiations for signcryption and 3n modular exponentiations for
unsigncryption. In the general case, where more than two or three participants
are involved, the communication bandwidth of our scheme is not lower than that
of Zheng’s scheme, since the whole transaction message for all n participants
must always be transferred.
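
The exponentiation counts can be tallied directly from the formulas of
Section 3.1. The breakdown below is our reading, assuming that each modular
exponentiation is counted separately (no simultaneous multi-exponentiation)
and that the unsigncryption figure is the total over all n recipients:

```latex
\begin{align*}
\text{signcryption:}\quad
  & \underbrace{y_{b_1}^{x},\ \ldots,\ y_{b_n}^{x}}_{n}
    \;+\; \underbrace{g^{x}}_{1} \;=\; n + 1, \\
\text{unsigncryption by one } B_i\text{:}\quad
  & \underbrace{g^{r_1 r_2 \cdots r_n}}_{1}
    \;+\; \underbrace{\left(y_a\, g^{r_1 r_2 \cdots r_n}\right)^{s}}_{1}
    \;+\; \underbrace{t^{x_{b_i}}}_{1} \;=\; 3, \\
\text{unsigncryption by all } n \text{ recipients:}\quad
  & 3n.
\end{align*}
```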

Unsigncryption can be carried out only within the domain of protocol
participants: a participant $B_i$ holding his own secret key $x_{b_i}$ within
the domain can obtain the partial information $m_i$, and only a person who
obtains $m_i$ can check whether
$\mathrm{KH}_{k}(c_1 \| \cdots \| m_i \| \cdots \| c_n) = r_i$ for signature
verification. Anyone who does not hold a secret key $x_{b_i}$ cannot take part
in unsigncryption.

In [4], the authors give a formal proof, in the random oracle model, of the
security of computing the two values $y_b^{x} \bmod p$ and $g^{x} \bmod p$ from
the same secret $x$, and they also show that the pseudo-independence of the two
computed values is an adequate guarantee of security for the signature scheme.
Namely, if a signer chooses the integer $x$ uniformly at random, then the two
values are (pseudo) independent, as both $g$ and $y_b = g^{x_b} \bmod p$ are
generators in $Z_p^*$ of order $q$, which is a prime. This ensures that
signature verification and partial recovery of bits do not leak information
that can be used to attack message confidentiality or to forge a signature. The
same method can be applied to our scheme. According to this security analysis,
if a signer chooses the integer $x$ uniformly at random, then the n + 1 values
$y_{b_1}^{x} \bmod p$, ..., $y_{b_n}^{x} \bmod p$ and $g^{x} \bmod p$ in
Domain-verifiable signcryption are (pseudo) independent, as $g$,
$y_{b_1} = g^{x_{b_1}} \bmod p$, ..., $y_{b_n} = g^{x_{b_n}} \bmod p$ are
generators in $Z_p^*$ of order $q$, which is a prime. By this argument, the
Domain-verifiable signcryption scheme provides message confidentiality and
signature unforgeability.



4 EFT Protocol Based on Domain-Verifiable Signcryption

EFT is considered to be any transfer of funds, other than a transaction by
check, draft, or similar paper instrument, that is initiated through an
electronic terminal, telephone, computer or magnetic tape for the purpose of
ordering, instructing, or authorizing a financial institution to debit or
credit an account. In the inter-bank EFT protocol, the withdrawal account and
the deposit account are held at different banks. A client submits an EFT
request to the bank with which he has a business relationship. The bank that
receives the request withdraws the corresponding amount from the requester’s
account and asks the deposit bank to deposit the same amount into the
recipient’s account. The withdrawal bank, on receiving the result of the
deposit from the deposit bank, informs the requesting client of the final
result of the funds transfer [2].

The message that a client sends to the withdrawal bank consists of the client’s
information, such as his own account number and PIN (Personal Identification
Number), and the recipient’s information, such as the deposit bank name, the
deposit account number and the amount of money to be transferred. This message
has to be encrypted and signed for privacy and integrity. In detail, the
client’s information is encrypted for the withdrawal bank and the recipient’s
information is encrypted for the deposit bank. In addition, the transaction
for the EFT protocol has to be authenticated by both the withdrawal and the
deposit banks.

To use a signcryption scheme in the inter-bank EFT protocol, we need a TTP if
we use Zheng’s scheme. With Domain-verifiable signcryption, we do not need a
TTP, as shown in Fig. 1.

[Figure 1. Two panels: "Zheng's Scheme Using TTP" and "Domain-Verifiable
Scheme". Arrow labels: Signcryption(M1), Signcryption(M2), Signcryption(M).
Legend: TTP = Trusted Third Party; BW = Withdrawal Bank; BD = Deposit Bank;
M = message for EFT transaction (M = M1 || M2); M1 = withdrawal information;
M2 = deposit information.]

Fig. 1. EFT protocol using the signcryption schemes



4.1 Inter-Bank EFT Protocol 

We use the following notations to describe this protocol. 

Participants and Tools

Client: A
Withdrawal Bank: BW
Deposit Bank: BD
$Signcrypt_A(*)$: Domain-verifiable signcryption by client A, including
signature-only mode [9]
$Unsigncrypt_A(*)$: Domain-verifiable unsigncryption of a message signcrypted
by client A
$Sign_A(*)$: signature-only mode of signcryption by client A
$\|$: message concatenation
hash(*): hash algorithm

Preparation 

Creation of funds transfer information: $M = M_1 \| M_2 \| COM$

- $M_1$: Client A’s information such as withdrawal account number and PIN,
encrypted for the withdrawal bank

- $M_2$: Deposit information such as deposit bank and deposit account number,
encrypted for the deposit bank

- $COM$: Common data for EFT such as the amount of money to be transferred,
date, sequence number and recipient’s name. This data should be kept as
plaintext for the transaction processing.

Transfer Protocol

1. Client A generates $SM = (c_1, c_2, COM, r_1, r_2, s)$, where
$c_1 = E_{k_1}(M_1)$, $c_2 = E_{k_2}(M_2)$,
$r_1 = \mathrm{KH}_{k}(M_1 \| c_2 \| COM)$,
$r_2 = \mathrm{KH}_{k}(c_1 \| M_2 \| COM)$ and
$s = x/(r_1 r_2 + x_a) \bmod q$, through $Signcrypt_A(M)$, and then sends SM to
the withdrawal bank BW.

2. BW processes $Unsigncrypt_A(SM)$ to decrypt its own message $M_1$ from $c_1$
and verifies SM.

3. After BW checks, by the date and sequence number in the message, that the
request is not a replay, BW withdraws the money from A’s account given in $M_1$
and sends SM to the deposit bank BD.

4. BD processes $Unsigncrypt_A(SM)$ to decrypt its own message $M_2$ from $c_2$
and verifies SM.

5. After BD checks, by the date and sequence number in the plaintext COM, that
the request is not a replay, BD deposits the money into the corresponding
account using the decrypted $M_2$.

6. BD generates $r = Sign_{BD}(SM \| \text{Result of Deposit})$ and then sends
(Result of Deposit, $r$) to BW.

7. BW does the necessary processing according to the result of deposit received
from BD and generates $r' = Sign_{BW}(\text{Result of Transfer})$. BW then
sends (Result of Transfer, $r'$) to client A.

8. Client A can use the received (Result of Transfer, $r'$) as a receipt for the
counterpart of the transfer.
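
Steps 1, 2 and 4 of the transfer protocol can be sketched as the two-recipient
case of Domain-verifiable signcryption with the plaintext field COM bound into
both keyed hashes. This is a hedged sketch, not the protocol’s wire format:
the toy parameters (p = 2039, q = 1019, g = 4), the SHA-256 / HMAC-SHA-256 /
XOR keystream stand-ins, the length-prefixed `join` encoding and the function
names are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

P, Q, G = 2039, 1019, 4   # toy group (assumption): G has prime order Q mod P


def h(value: int) -> bytes:
    return hashlib.sha256(str(value).encode()).digest()


def kh(key: bytes, data: bytes) -> int:
    mac = hmac.new(key, data, hashlib.sha256).digest()
    return int.from_bytes(mac, "big") % Q


def xor_stream(key: bytes, data: bytes) -> bytes:
    stream, block = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + block.to_bytes(4, "big")).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def keygen():
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def join(*parts: bytes) -> bytes:
    """Unambiguous stand-in for '||': 4-byte length prefix per part."""
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def signcrypt_transfer(m1, m2, com, x_a, y_bw, y_bd):
    """Step 1: client A builds SM = (c1, c2, COM, r1, r2, s)."""
    while True:
        x = secrets.randbelow(Q - 1) + 1
        k1, k2 = h(pow(y_bw, x, P)), h(pow(y_bd, x, P))
        k = h(pow(G, x, P))
        c1, c2 = xor_stream(k1, m1), xor_stream(k2, m2)
        r1 = kh(k, join(m1, c2, com))
        r2 = kh(k, join(c1, m2, com))
        if (r1 * r2 + x_a) % Q != 0:
            s = x * pow(r1 * r2 + x_a, -1, Q) % Q
            return c1, c2, com, r1, r2, s


def bank_unsigncrypt(sm, index, y_a, x_bank):
    """Steps 2 and 4: bank `index` (1 = BW, 2 = BD) recovers its own part."""
    c1, c2, com, r1, r2, s = sm
    t = pow(y_a * pow(G, r1 * r2, P) % P, s, P)
    k, k_i = h(t), h(pow(t, x_bank, P))
    if index == 1:
        m = xor_stream(k_i, c1)
        ok = kh(k, join(m, c2, com)) == r1
    else:
        m = xor_stream(k_i, c2)
        ok = kh(k, join(c1, m, com)) == r2
    return m if ok else None
```

In this sketch BW simply forwards the same tuple SM to BD in step 3; nothing is
re-encrypted, which is why no TTP is needed.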



4.2 Security Consideration 

The security of the inter-bank EFT protocol based on Domain-verifiable
signcryption is summarized below.

- Confidentiality: An adversary cannot recover the message M transferred
between a client and the banks, because each part of the message is encrypted
for the corresponding bank before transmission. In particular, the PIN in
$M_1$, the client’s secret information for making a withdrawal, is not
disclosed to anyone except the withdrawal bank.

- Authentication and Integrity: To send a fund transfer message, a client must
sign that message using his own private key. The banks that receive a fund
transfer message can authenticate the client who sent it using the client’s
public key. The banks can also check the integrity of the received message,
since that message is signed by the client.

- Non-repudiation: A client’s signature on the fund transfer message for a
transaction can be used as evidence [5, 8] of a user’s request for EFT.

- Replay attack: If an adversary tries to replay the protocol, the bank can
detect the replayed message by checking whether the date and sequence number
in the message duplicate those of a message it has already received.




- Usage as receipt: The result of a transfer, together with the bank’s
signature, can be given to the recipient as a receipt for the funds transfer.
The recipient can verify the receipt, received from the requester of the funds
transfer, using the bank’s public key.
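
The replay check mentioned in the list above (a bank rejects a request whose
date and sequence number it has already seen) could be sketched as follows.
The class name `ReplayGuard`, its fields, and the choice to key the check by
client identifier as well as by date and sequence number are assumptions of
this sketch; the protocol description does not prescribe a data structure.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ReplayGuard:
    """Remembers (client, date, sequence number) triples already processed."""
    seen: set = field(default_factory=set)

    def accept(self, client_id: str, day: date, seq: int) -> bool:
        """Return True for a fresh request, False for a replayed one."""
        key = (client_id, day, seq)
        if key in self.seen:
            return False
        self.seen.add(key)
        return True


guard = ReplayGuard()
first = guard.accept("A", date(1999, 12, 9), 1)    # fresh request
second = guard.accept("A", date(1999, 12, 9), 1)   # same date and sequence
third = guard.accept("A", date(1999, 12, 9), 2)    # next sequence number
```

A bank would run such a check only after unsigncryption succeeds, so that the
date and sequence number being compared are the authenticated ones from COM.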

5 Concluding Remarks 

We proposed a Domain-verifiable signcryption scheme for situations in which
only n predetermined participants within a domain can decrypt and verify. The
scheme is useful when each participant must be able to decrypt his own
message, which is a part of the whole transaction message, and all
participants must be able to verify the whole transaction message.

As an example application, we designed an inter-bank EFT protocol based on the
Domain-verifiable signcryption scheme. Because it needs no TTP and its
computational cost is comparable to that of Zheng’s scheme, we consider this
inter-bank EFT protocol efficient enough for practical use. The detailed design
of multi-level hierarchical key distribution, or of a SET protocol, based on
our Domain-verifiable signcryption needs further research.